Sheffield, South Yorkshire, Yorkshire, United Kingdom
Essential Consulting
Background and Role We are looking to onboard a Senior Software Engineer (Platform Engineer) on a contract basis for a large global bank. The Senior Software Engineer will join a small, dedicated 4-person team within Infrastructure in the Chief More ❯
Handsworth, Yorkshire and the Humber, United Kingdom
Essential Consulting
Background and Role We are looking to onboard a Senior Software Engineer (Platform Engineer) on a contract basis for a large global bank. The Senior Software Engineer will join a small, dedicated 4-person team within Infrastructure in the Chief More ❯
AWS Public Cloud infrastructure and implementation of IaC using Terraform. The role will work closely with the SRE and Engineering teams to ensure that the Cloud environment has sufficient observability and is appropriately managed. Skills and experience required: Strong technical operational skills in supporting AWS Cloud Hosted environments, and at least 3 years in an Infrastructure support role. Strong understanding of More ❯
environments. Relevant Skills: Experience working in Agile environments Strong understanding of Site Reliability Engineering (SRE) principles Familiarity with Azure DevOps for CI/CD and pipeline management Knowledge of observability tools: Prometheus, Grafana, Loki, Tempo Experience with Infrastructure as Code: Helm, Kustomize Hands-on experience with Tekton and ArgoCD Ability to support and troubleshoot OpenShift Operators (ServiceMesh, ODF, ACS, ACM More ❯
Solid experience building and deploying services with Java and Spring Boot. Comfort working in a cloud-native environment - Kubernetes (EKS), containers, scaling etc. An interest in observability, using tools like Prometheus and Grafana to keep services healthy and understand usage patterns. Familiarity with AWS services and how to integrate them into modern applications. A keen focus on quality and security More ❯
that automate their processes. Contribute to the development of our Virtual Agent development platform that scales with our product strategy. Ensure our AI services maintain high standards of reliability, observability, availability, and performance. Participate in our machine learning community to influence how we implement machine learning and computer vision technologies, shaping Unitary's future. Take ownership of customer outcomes with More ❯
that automate their processes. Contribute to the development of our AI agent development platform that scales with our product strategy. Ensure our AI services maintain high standards of reliability, observability, availability, and performance. Participate in our machine learning community to influence how we implement machine learning and computer vision technologies, shaping Unitary's future. Take ownership of customer outcomes with More ❯
components such as market data feeds, order gateways, execution algorithms, risk engines, UI dashboards, middle office reconciliation, and account infrastructure. We emphasize event-driven, deterministic system design, real-time observability, and strong security. Our tech stack includes Java (low-latency), Python, Web UI (React/Ag-Grid), Aeron, ClickHouse, Kubernetes, and modern CI/CD tooling, with a strong focus More ❯
Code principles Design an agile release engineering strategy that delivers value incrementally and continuously Support a highly-available live production system, respond to alerts, diagnose problems using logs and observability tooling, triage and resolve incidents What we offer We make sure our team is well looked after with generous salaries and a great benefits package which includes: Enhanced pension with More ❯
California, with additional locations across the globe. What you'll do: As a Site Reliability Engineer at Zefr, you'll apply your expertise in cloud infrastructure, CI/CD, Observability, and core SRE concepts to deliver high-quality, reliable, and scalable solutions. A significant aspect of this role involves working closely with Zefr's Engineering and Data Science teams ensuring … EKS expected), Helm, Kustomize Service Mesh: Istio CI/CD & Automation: CI/CD Pipelines: GitHub Actions GitOps/Continuous Delivery: Argo CD Primary Scripting/Automation Language: Python Observability & Monitoring: Monitoring & Alerting: Prometheus, Datadog, PagerDuty Telemetry Standards: OpenTelemetry Application & Data Ecosystem (Supporting): Application Languages/Frameworks: Python, FastAPI, Flask, Node.js, React Data Streaming: Apache Kafka Data Processing/Transformation … CircleCI, Argo CD, Flux) Knowledge of IaC and configuration management tools (Terraform, OpenTofu, Crossplane, Pulumi, Ansible, CloudFormation) Strong problem-solving experience, focusing on automation Production experience with Monitoring and Observability tools (Prometheus, Grafana, Datadog, Thanos, New Relic, OpenTelemetry) Understanding of Cloud Networking concepts (Mesh Networking, NAT, Load Balancers, SSL Certificates and TLS termination, API Gateways, proxies, etc.) Strong written More ❯
for leading and executing the migration of data, dashboards, alerts, and configurations from Splunk systems to Elasticsearch. This role involves deep technical expertise in Splunk architecture, data ingestion, and observability tools, along with strong project management and stakeholder communication skills. Must have skills: -Splunk -ELK Stack -Kibana Nice to have skills: -stakeholder communication skills -strong project management Responsibilities: Minimum number More ❯
for leading and executing the migration of data, dashboards, alerts, and configurations from Splunk systems to Elasticsearch. This role involves deep technical expertise in Splunk architecture, data ingestion, and observability tools, along with strong project management and stakeholder communication skills. Must have skills: -Splunk -ELK Stack -Kibana Nice to have skills: -stakeholder communication skills -strong project management Detailed Job Description: -Ability to deploy More ❯
Milton Keynes, Buckinghamshire, South East, United Kingdom
Interact Consulting Limited
or strong interest in learning) cloud-native tooling: AWS (especially CloudWatch) Artifact Management (e.g., Artifactory, CodeArtifact) Infrastructure as Code with Terraform Monitor test metrics, troubleshoot failures, and improve system observability and debuggability. More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Tank Recruitment
backend services/databases. Experience with TDD and testing frameworks such as Jest and Pact. Knowledge of CI/CD pipelines (ideally GitHub Actions). Hands-on experience with observability/monitoring tools (e.g., Datadog). A proactive, problem-solving mindset with the ability to work both independently and collaboratively. Senior Software Engineer Location: London (Hybrid 2 Days a week More ❯
of student lifecycle processes in Higher Education and relevant data domains. Knowledge of event-driven and message-based architectures (Event Hub, Kafka, or Service Bus). Experience with monitoring and observability tools like Azure Monitor, Application Insights, and Log Analytics. Awareness of data security, GDPR, and compliance in educational or public sector environments. Exposure to OpenAPI/Swagger, API lifecycle management More ❯
and maintain secure, event-driven integrations via webhooks and callback mechanisms Lead the backend design of new products and services, ensuring interoperability and long-term maintainability Optimize for resilience, observability, and fault-tolerant behavior across distributed cloud systems Deploy infrastructure using AWS CDK and build with modern tools like Golang, TypeScript, Python, and PHP Ensure clean interface contracts and clear More ❯
Tool to production by building and supporting ML-driven applications. Furthering Developer Experience (DevEx) by mentoring others in writing code that is intuitive, clear, and easy to test Developing observability for new and existing ML applications and GenAI/LLM integrations, making use of the Grafana Stack (Prometheus, Loki, Tempo) Working closely with Data Scientists and ML Engineers throughout the More ❯
scalability and reduce manual intervention. Operational Security, SRE & Assurance: Ensure security platforms are resilient, continuously monitored, and designed for 24x7 support and incident response readiness. Embed security telemetry and observability to enable proactive threat detection and automated response. Apply SRE principles to improve reliability, performance, and maintainability of security services. Define service level objectives (SLOs) and key performance indicators (KPIs More ❯
Oldham, Greater Manchester, North West, United Kingdom
Innovative Technology
consistency, repeatability, and auditability across environments Develop and maintain developer tooling and golden templates (CI/CD pipelines, scaffolds, environments) to standardize best practices across teams Design and implement observability frameworks (metrics, tracing, logging, alerting) that are easy to consume and part of the platform baseline Eliminate repetitive tasks through automation and opinionated defaults, so teams are not blocked by … and orchestration (Docker, Kubernetes) Familiarity with CI/CD systems (GitHub Actions, GitLab CI, Jenkins, etc.) Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation) Knowledge of observability tools (Prometheus, Grafana, ELK stack, Datadog, etc.). Solid grasp of Linux systems and networking fundamentals Strong problem-solving and debugging skills Your Package & Perks: A competitive salary Flexible working More ❯
Wokingham, Berkshire, South East, United Kingdom Hybrid / WFH Options
Sanderson Government and Defence
for a sharp-minded Site Reliability Engineer to join our cloud-native mission in Azure. If you thrive in Agile teams, live for automation, and know your way around observability stacks and CI/CD pipelines - this is your playground. What you'll be doing: Automating deployment, monitoring & infrastructure with precision Owning platform reliability, performance & SLAs Building IaC with Helm More ❯
South West, England, United Kingdom Hybrid / WFH Options
Interquest
platform - Working deeply with Kubernetes (AKS) alongside a highly skilled team of specialists - Supporting and delivering projects in collaboration with the wider delivery team - Driving best practices around automation, observability, scalability, and performance - Balancing projects with operational support across a growing international environment What's in it for you? - Salary up to £60k+ - Fully remote global team (no plans to ever More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Huxley
availability, secure deployments, and efficient agent orchestration using AKS. You will create and maintain CI/CD pipelines for Azure services and Semantic Kernel agents, manage Kubernetes clusters, and integrate observability tools to monitor system health and performance. You'll also ensure alignment with enterprise-grade security practices, including zero trust principles, identity-aware routing, and integration with Azure API Management More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Huxley Associates
availability, secure deployments, and efficient agent orchestration using AKS. You will create and maintain CI/CD pipelines for Azure services and Semantic Kernel agents, manage Kubernetes clusters, and integrate observability tools to monitor system health and performance. You'll also ensure alignment with enterprise-grade security practices, including zero trust principles, identity-aware routing, and integration with Azure API Management More ❯