Observability Jobs in the UK


DevOps Engineer

Birmingham, England, United Kingdom
Explore Group
next-generation AI products. You’ll join a small, experienced team developing an internal Kubernetes-based platform that enables AI innovation across the organisation automating everything from deployments to observability, and helping developers build smarter applications with confidence. What you’ll be doing: Designing, deploying, and maintaining Azure Kubernetes (AKS) environments Managing Infrastructure as Code with Terraform and improving GitOps … workflows (ArgoCD/GitHub Actions) Building observability and monitoring stacks using Prometheus, Grafana, and Loki Supporting AI workloads (LLMs, RAG, and document processing applications) running on Kubernetes Automating platform operations with Python, Go, and shell scripting Implementing security guardrails, PII compliance tooling, and best practices for production AI systems What you’ll need: 3+ years’ experience in DevOps or Platform … Engineering Strong background in Azure and Kubernetes Hands-on experience with Terraform, CI/CD, and container orchestration Familiarity with observability tools (Prometheus, Grafana, Loki) Scripting or programming skills in Python or Go Interest in AI infrastructure, LLMOps, or large language model deployment More ❯

Site Reliability Engineer

Wigan, Lancashire, England, United Kingdom
Hybrid/Remote Options
Searchability
As part of their continued investment in reliability and platform performance, they are now seeking an experienced Site Reliability Engineer to strengthen their engineering function and help evolve their observability and automation capabilities. THE BENEFITS Hybrid working model (office and remote) Opportunity to define and lead SRE strategy within a collaborative culture Exposure to modern cloud-native and containerised environments … and performance of complex online platforms supporting high-volume transactions. Working closely with operations and product teams, you'll monitor production systems, develop automation to improve uptime, and refine observability to provide real-time insight into platform health. You'll also play a key role in performance testing, system tuning and incident management to ensure smooth operation during critical events. … SITE RELIABILITY ENGINEER ESSENTIAL SKILLS At least 2 years' experience working as an SRE Deep understanding of system reliability, scalability and performance tuning Experience with observability tools (Grafana, Prometheus, OpenTelemetry) Proficiency in a programming language such as Go or .NET for automation and debugging Hands-on experience with AWS or another major cloud platform Knowledge of Kubernetes, Terraform, and Infrastructure More ❯
Employment Type: Full-Time
Salary: £40,000 per annum

Site Reliability Engineer

Wigan, Greater Manchester, United Kingdom
Hybrid/Remote Options
Searchability (UK) Ltd
As part of their continued investment in reliability and platform performance, they are now seeking an experienced Site Reliability Engineer to strengthen their engineering function and help evolve their observability and automation capabilities. THE BENEFITS Hybrid working model (office and remote) Opportunity to define and lead SRE strategy within a collaborative culture Exposure to modern cloud-native and containerised environments … and performance of complex online platforms supporting high-volume transactions. Working closely with operations and product teams, you'll monitor production systems, develop automation to improve uptime, and refine observability to provide real-time insight into platform health. You'll also play a key role in performance testing, system tuning and incident management to ensure smooth operation during critical events. … SITE RELIABILITY ENGINEER ESSENTIAL SKILLS At least 2 years' experience working as an SRE Deep understanding of system reliability, scalability and performance tuning Experience with observability tools (Grafana, Prometheus, OpenTelemetry) Proficiency in a programming language such as Go or .NET for automation and debugging Hands-on experience with AWS or another major cloud platform Knowledge of Kubernetes, Terraform, and Infrastructure More ❯
Employment Type: Permanent
Salary: £40,000 per annum

Site Reliability Engineer

Hereford, Herefordshire, England, United Kingdom
Hybrid/Remote Options
Hays Specialist Recruitment Limited
role focused on ensuring service availability, performance, and cost-efficiency across both cloud and on-prem infrastructure. You'll work closely with development and support teams to evolve infrastructure, enhance observability, and proactively mitigate reliability risks. Key Responsibilities: Collaborate with software engineers to improve reliability and performance. Automate operational tasks and reduce alert fatigue. Enhance monitoring and observability to pre-empt issues. Support development environments … protocols. Experience with cloud platforms, ideally AWS (EC2, RDS, S3, Lambda). Desirable: Coding experience in Java, Go, Python or similar. Knowledge of cross-domain technologies. Experience in service management environments. Practical application of observability patterns. Experience with Azure. Additional Information: Due to the nature of the work, successful candidates will be required to undergo security vetting. We welcome applications from all backgrounds and are committed to creating More ❯
Employment Type: Contractor
Rate: £500 - £600 per day

SysOps Engineer

United Kingdom
Hybrid/Remote Options
CloudSeekers
cloud environments • Automating day to day processes using Python and Bash • Building and maintaining infrastructure as code with tools such as Terraform, CloudFormation and Ansible • Enhancing monitoring, alerting and observability using tools like Zabbix, Grafana and ELK • Deploying and supporting containerised workloads • Troubleshooting complex systems and contributing to continuous improvement • Collaborating with engineering teams to deliver stable and scalable solutions … scripting skills in Python or Bash • Understanding of AWS or similar cloud platforms • Experience with IaC tooling such as Terraform, CloudFormation, Puppet or Ansible • Good knowledge of monitoring and observability tooling • Ability to diagnose and resolve issues in hybrid environments • A proactive mindset and willingness to automate everything that can be improved Nice to have • Experience working within regulated environments More ❯

DevOps Engineer – Global Multi-Strategy Hedge Fund – Industry Leading Comp Package

London Area, United Kingdom
Mondrian Alpha
in London. Working alongside software and cybersecurity engineers, you’ll help design, build, and automate a hybrid multi-cloud estate across AWS and Azure—enhancing CI/CD pipelines, observability, and developer experience. You’ll take ownership of business-critical infrastructure, shaping cloud strategy end-to-end and collaborating with global teams across the US and Europe to drive efficiency … CI/CD pipelines through tools such as Azure DevOps, GitHub Actions, or Octopus. You’ll also be adept at automating workflows in Python or PowerShell and implementing modern observability solutions including DataDog, OpenSearch, and LogicMonitor. This is a rare opportunity to join a high-performing, global hedge fund where technology and engineering directly drive investment performance and operational scale. More ❯

DevOps Engineer – Global Multi-Strategy Hedge Fund – Industry Leading Comp Package

City of London, London, United Kingdom
Mondrian Alpha
in London. Working alongside software and cybersecurity engineers, you’ll help design, build, and automate a hybrid multi-cloud estate across AWS and Azure—enhancing CI/CD pipelines, observability, and developer experience. You’ll take ownership of business-critical infrastructure, shaping cloud strategy end-to-end and collaborating with global teams across the US and Europe to drive efficiency … CI/CD pipelines through tools such as Azure DevOps, GitHub Actions, or Octopus. You’ll also be adept at automating workflows in Python or PowerShell and implementing modern observability solutions including DataDog, OpenSearch, and LogicMonitor. This is a rare opportunity to join a high-performing, global hedge fund where technology and engineering directly drive investment performance and operational scale. More ❯

Head of Software Engineering

Manchester, England, United Kingdom
Socium
modular, cloud-native platform (Azure/AWS – your call) Driving a culture shift across engineering – CI/CD, SRE, DevEx, clean code Setting engineering standards: quality gates, testing practices, observability, automation Working with Product, Delivery, and Exec teams to align on priorities and timelines What we’re looking for: Senior engineering leader (Head of/Director level) with strong architectural … design, modern API practices (REST, gRPC) TypeScript across services and frontend (frameworks are flexible) Infrastructure as Code (Terraform) CI/CD baked in, GitOps model preferred Emphasis on testing, observability, and secure-by-design Why this role? You’ll join a profitable, well-backed SaaS business with real scale — and a brief to modernise how engineering operates from the ground More ❯

AWS Solutions Architect (Python Background)

London Area, United Kingdom
Hybrid/Remote Options
Robert Half
experience at module level, including policy-as-code (OPA/Conftest) Proven design and delivery using ECS Fargate or EKS with secure image scanning and runtime security Experience building observability frameworks – metrics, logging, tracing, retention, and access control Strong understanding of security and compliance requirements in financial services (FCA, DORA, PRA) Familiarity with CI/CD pipelines (GitHub Actions, Jenkins … strong architect who codes, comfortable working across architecture, engineering, and data science teams to deliver compliant, cloud-native patterns that scale. You’ll design landing zones, reusable modules, and observability frameworks, all aligned to enterprise controls and best practice. All candidates must complete standard screening (Right to Work, DBS, credit/sanctions, employment verification). This is an exciting opportunity More ❯

AWS Solutions Architect (Python Background)

City of London, London, United Kingdom
Hybrid/Remote Options
Robert Half
experience at module level, including policy-as-code (OPA/Conftest) Proven design and delivery using ECS Fargate or EKS with secure image scanning and runtime security Experience building observability frameworks – metrics, logging, tracing, retention, and access control Strong understanding of security and compliance requirements in financial services (FCA, DORA, PRA) Familiarity with CI/CD pipelines (GitHub Actions, Jenkins … strong architect who codes, comfortable working across architecture, engineering, and data science teams to deliver compliant, cloud-native patterns that scale. You’ll design landing zones, reusable modules, and observability frameworks, all aligned to enterprise controls and best practice. All candidates must complete standard screening (Right to Work, DBS, credit/sanctions, employment verification). This is an exciting opportunity More ❯

Senior Platform Engineer

Oxford, England, United Kingdom
SR2 | Socially Responsible Recruitment | Certified B Corporation™
for monitoring, security, and performance. Contribute to architectural decisions and technical design reviews. Ensure compliance with secure coding standards (OWASP, API security, web application best practices). Support automation, observability, and continuous improvement initiatives across the engineering organisation. ✅ You’ll Be a Great Fit If You... Have strong coding experience (Python, TypeScript, Go, JavaScript, C#, or similar). Bring solid … great). Care deeply about building reliable systems that make a real-world impact. 💡 Bonus Points For Experience working in AI, Biotech, or other data-intensive domains. Familiarity with observability stacks and monitoring best practices. Interest in security automation and platform resilience. A collaborative mindset and passion for mentoring or knowledge sharing. 🙌 What’s on Offer More ❯

AWS Solution Architect Python Background

London, South East, England, United Kingdom
Hybrid/Remote Options
Robert Half
experience at module level, including policy-as-code (OPA/Conftest) Proven design and delivery using ECS Fargate or EKS with secure image scanning and runtime security Experience building observability frameworks - metrics, logging, tracing, retention, and access control Strong understanding of security and compliance requirements in financial services (FCA, DORA, PRA) Familiarity with CI/CD pipelines (GitHub Actions, Jenkins … strong architect who codes, comfortable working across architecture, engineering, and data science teams to deliver compliant, cloud-native patterns that scale. You'll design landing zones, reusable modules, and observability frameworks, all aligned to enterprise controls and best practice. All candidates must complete standard screening (Right to Work, DBS, credit/sanctions, employment verification). This is an exciting opportunity More ❯
Employment Type: Contractor
Rate: £550 - £600 per day

Solutions Architect

London Area, United Kingdom
Hybrid/Remote Options
Teranode Group
Guide engineers through discovery → design → build → rollout for complex initiatives. Technical Stewardship Champion multi-tenant patterns, API-first design, and stateless, horizontally scalable services. Ensure smooth CI/CD, observability, and operability in partnership with DevOps/SRE. Promote performance, cost, and reliability trade-offs with clear rationale. Security, Reliability & Compliance Embed security-by-design (mTLS/service mesh, secret … Argo CD, Vault/HSM) Platform services (API Gateway, RBAC/SSO, rate limiting, policy enforcement) Event-driven integration (REST/gRPC, idempotency, NATS JetStream and/or Kafka) Observability stacks (OpenTelemetry, Prometheus, Loki, alerting/runbooks) Multi-region resilience (traffic steering, failover, DR testing) Data & storage (PostgreSQL/Distributed SQL, object storage/S3, caching/Redis) Identity & trust More ❯

Solutions Architect

City of London, London, United Kingdom
Hybrid/Remote Options
Teranode Group
Guide engineers through discovery → design → build → rollout for complex initiatives. Technical Stewardship Champion multi-tenant patterns, API-first design, and stateless, horizontally scalable services. Ensure smooth CI/CD, observability, and operability in partnership with DevOps/SRE. Promote performance, cost, and reliability trade-offs with clear rationale. Security, Reliability & Compliance Embed security-by-design (mTLS/service mesh, secret … Argo CD, Vault/HSM) Platform services (API Gateway, RBAC/SSO, rate limiting, policy enforcement) Event-driven integration (REST/gRPC, idempotency, NATS JetStream and/or Kafka) Observability stacks (OpenTelemetry, Prometheus, Loki, alerting/runbooks) Multi-region resilience (traffic steering, failover, DR testing) Data & storage (PostgreSQL/Distributed SQL, object storage/S3, caching/Redis) Identity & trust More ❯

Back End Developer

Northern Ireland, United Kingdom
Hybrid/Remote Options
Ocho
and evolve gRPC/REST APIs powering customer-facing features and internal platforms * Implement event-driven components (queues/streams) and caching to meet latency and throughput SLOs * Add observability (metrics, logs, traces), performance profiling, and production readiness * Collaborate with product, DS, and frontend on API shape, contracts, and release cadence Essential Criteria: * 3+ years' backend experience (2+ in Go … stores, analytics workloads) Nice to Have: * Experience integrating Python ML services (TF/torch), feature stores, or model registries * Event streaming & messaging (Kafka, Kinesis, NATS, SQS, Pub/Sub) * Observability stacks (OpenTelemetry, Prometheus, Grafana) * IaC (Terraform), security-by-design, OAuth/OIDC, secrets management * Batch & streaming data processing (Spark/Flink/Beam) or columnar analytics (Parquet/Arrow) Career More ❯

Senior Full Stack Software Engineer (SaaS Platform) - Pathogen

Oxford, Oxfordshire, United Kingdom
Ellison Institute of Technology
enabling step-changes in global genomics research and clinical practice. This role combines hands on software engineering with technical oversight, mentoring junior engineers and driving improvements in standards, security, observability, and overall product quality. As part of a cross functional team, you'll collaborate with engineers, product managers, bioinformaticians, and platform specialists to deliver secure, reliable, and high quality software. … engineers. Ensure scalability and reliability of the overall solution, handling large amounts of genomic, and other multi-modal data. Drive best practices for security, testing, CI/CD, and observability across the stack. Enhance performance and responsiveness of user interfaces for data-heavy applications. Champion usability, ensuring interfaces meet the needs of clinicians and researchers. Continuously improve development workflows through More ❯
Employment Type: Permanent

Director of Engineering

Manchester Area, United Kingdom
Hybrid/Remote Options
Lynx Recruitment
Director of Next Generation Engineering Salary: Up to £150,000 + bonus + benefits Location: Manchester - Hybrid working About the Role We're working with a leading AI and technology innovation consultancy that helps organisations design and deliver intelligent, data More ❯

Site Reliability Engineer

Cambridge, Cambridgeshire, United Kingdom
Hybrid/Remote Options
Willis Towers Watson
Description We are looking for an experienced Site Reliability Engineer to join the Igloo team in Cambridge to champion observability and delivery. The candidate should have strong communication skills, experience in coaching or sharing knowledge, and proficiency in Azure and Observability platforms. Join Insurance Consulting and Technology (ICT) during a transformative period aimed at enhancing customer and business value. You … new and exciting uses of their technology. This role will have the opportunity to help the team and product deal with exciting, complex and large-scale client propositions where observability will be essential and help transform how the product is designed and deployed. You will join a cross-team guild of Site Reliability Engineers, which enables you to not only … influence direction within your product family, but to also help shape how we handle observability and monitoring across ICT. This role is open to flexible and hybrid working arrangements, with presence in the Cambridge office a minimum of two days per week. The Role: Collaborate with cross-functional teams to ensure the reliability, availability, and performance of our client-facing More ❯
Employment Type: Permanent

Senior Support Executive

City of London, London, United Kingdom
Hybrid/Remote Options
Queen Square Recruitment
Support Executive (Observability & Reliability Engineering) Location: London – Heathrow (Hybrid, 4 days onsite) Contract Type: 6-Month Contract Day Rate: £375 (Inside IR35) About the Role Our client, a top global organization, is seeking a Support Executive with strong experience in observability, reliability engineering, and cross-functional delivery coordination. In this role, you’ll ensure seamless digital experiences by implementing … unified observability frameworks and optimizing service reliability across multiple product teams. You’ll collaborate with Digital Product, Engineering, and Operations teams to define SLIs, SLOs, and error budgets, implement AI-driven monitoring, and continuously improve incident response and system performance. Key Responsibilities Drive a single-pane-of-glass observability approach for customer journey monitoring. Coordinate release schedules and dependencies across … multiple product teams. Define and implement observability standards: metrics, dashboards, alerts, and SLOs. Map and monitor end-to-end customer journeys to identify risks and performance issues. Collaborate with DevOps, Engineering, and Operations to improve service reliability. Integrate AI-driven analytics to predict incidents and reduce mean time to resolution (MTTR). Lead post-incident reviews and implement reliability More ❯

Senior Support Executive

London Area, United Kingdom
Hybrid/Remote Options
Queen Square Recruitment
Support Executive (Observability & Reliability Engineering) Location: London – Heathrow (Hybrid, 4 days onsite) Contract Type: 6-Month Contract Day Rate: £375 (Inside IR35) About the Role Our client, a top global organization, is seeking a Support Executive with strong experience in observability, reliability engineering, and cross-functional delivery coordination. In this role, you’ll ensure seamless digital experiences by implementing … unified observability frameworks and optimizing service reliability across multiple product teams. You’ll collaborate with Digital Product, Engineering, and Operations teams to define SLIs, SLOs, and error budgets, implement AI-driven monitoring, and continuously improve incident response and system performance. Key Responsibilities Drive a single-pane-of-glass observability approach for customer journey monitoring. Coordinate release schedules and dependencies across … multiple product teams. Define and implement observability standards: metrics, dashboards, alerts, and SLOs. Map and monitor end-to-end customer journeys to identify risks and performance issues. Collaborate with DevOps, Engineering, and Operations to improve service reliability. Integrate AI-driven analytics to predict incidents and reduce mean time to resolution (MTTR). Lead post-incident reviews and implement reliability More ❯

Lead DevOps Architect

London, United Kingdom
Stott & May Professional Search Limited
week) Day Rate: Market rate (Inside IR35) Contract Duration: 6 months Role Summary We are looking for an experienced DevOps Lead/Architect to design, implement, and maintain scalable observability and cloud infrastructure More ❯
Employment Type: Contract
Rate: GBP 750 - 800 Daily

Fin Ops Engineer

London, United Kingdom
Experis
to build cost-effective solutions on Microsoft Azure while maintaining agility and fostering innovation. This position is perfect for engineers who are passionate about optimising cloud usage, enhancing cost observability, and championing a Fin Ops culture More ❯
Employment Type: Permanent
Salary: GBP 85,000 Annual

Fin Ops Engineer

London, UK
Experis
to build cost-effective solutions on Microsoft Azure while maintaining agility and fostering innovation. This position is perfect for engineers who are passionate about optimising cloud usage, enhancing cost observability, and championing a Fin Ops culture. Experience in some of the following would be ideal Partner ... More ❯

Fin Ops Engineer

United Kingdom, UK
Experis
to build cost-effective solutions on Microsoft Azure while maintaining agility and fostering innovation. This position is perfect for engineers who are passionate about optimising cloud usage, enhancing cost observability, and championing a Fin Ops culture. Experience in some of the following would be ideal Partner ... More ❯

Software Engineer - Backend & Integrations

Manchester, Lancashire, United Kingdom
Mayfleet Recruitment Limited
for background tasks and failure retries Rate-limit handling and backoff strategies Cloud deployment experience with strong security hygiene, configuration management, and secrets handling Exposure to monitoring, logging, and observability tools The ideal candidate values reliability, secure coding practices, graceful failure, and clean software architecture suitable for a globally distributed gaming audience. More ❯
Employment Type: Contract

Observability
10th Percentile: £56,593
25th Percentile: £67,500
Median: £80,000
75th Percentile: £105,000
90th Percentile: £140,250