Observability Jobs in the UK excluding London

51 to 75 of 1,235 Observability Jobs in the UK excluding London

DevOps Engineer

Portsmouth, England, United Kingdom
Hybrid / WFH Options
Trust In SODA
through the entire development life cycle. Infrastructure-as-code Bash Delivery methods and techniques, including agile scrum experience. Desirable Skills: RedHat OpenShift Hashicorp (such as Terraform, Packer, Vault) Ansible Observability (such as Prometheus, Grafana, Splunk) Containerised services (such as Postgres, Redis, Kafka, Keycloak, Elk) Experience of doing all the above at OS or S level YAML based pipelines. Immutable infrastructure More ❯

Site Reliability Engineer III

Glasgow, Scotland, United Kingdom
JPMorgan Chase & Co
recognize roadblocks and demonstrates interest in learning technology that facilitates innovation. Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, Terraform. Experience in at least one observability tool such as Dynatrace, Datadog, New Relic, CloudWatch, AppDynamics, Splunk. Preferred Qualification: Experience a plus in common SRE toolchains: Grafana, Prometheus, Elasticsearch, Kibana, Jaeger. More ❯

Applications Support Senior Analyst - AVP (Belfast)

Belfast, Northern Ireland, United Kingdom
Citigroup Inc
the business succeed. Provide timely and effective technical support for end users of a designated set of DevOps tools, encompassing traditional tools (e.g., CI/CD platforms, monitoring and observability tools, source code management systems) and GenAI-powered tools. Troubleshoot and resolve complex technical issues involving in-depth analysis of logs, configurations, system behaviour. Proactively monitor the health, performance, and More ❯

DV Cleared DevOps Engineer

Bristol, Gloucestershire, United Kingdom
Hybrid / WFH Options
Curo Resourcing Ltd
domain adjacent technologies/services, such as: Docker, OpenShift, Kubernetes etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Observability - SRE Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem Excellent knowledge of YAML or similar languages The following Technical Skills & Experience would be desirable More ❯
Employment Type: Permanent
Salary: GBP Annual

Platform Engineer

Leeds, United Kingdom
Hybrid / WFH Options
Sportserve
Linux - RHEL and Debian based flavours. Cloud - Oracle Cloud. CI/CD - gitlab-ci, Jenkins, Ansible, Terraform, Helm, Fluxcd. Web servers - Nginx. Caching - Redis. Messaging queues - Kafka. Monitoring and Observability - ELK, Grafana, Prometheus, Thanos. Traffic Management - Haproxy, Keepalived, Cloud Load Balancing, etc. Scripting - Bash, Python. Virtualisation and Orchestration - Docker, Kubernetes. Databases - Cloud Managed MySQL. Desirable: OS Administration - Linux - Debian based More ❯
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer III - Markets

Belfast, Northern Ireland, United Kingdom
Hybrid / WFH Options
CME Group
both independently and collaboratively. Key Responsibilities Collaborate with senior SREs and Product engineering teams to monitor, maintain, and troubleshoot our Markets systems. Collaborate with Product teams to continuously improve observability and alerting of our applications to enable data-driven business decision, faster issue detection and incident resolution. Take accountability for delivery of moderately-complex features. Lead technical discussions for own More ❯

Lead Software Engineer

Bristol, England, United Kingdom
Hybrid / WFH Options
Lloyds Banking Group
collaboration skills, with the ability to influence and align diverse teams on a shared vision. Knowledge of DevOps practices and tools CI/CD pipelines. Knowledge of Monitoring and Observability tooling. In addition, any experience of these would be useful: Experience with data mesh concepts (e.g., domain-driven ownership, and data product thinking). Expertise in GCP services, including BigQuery More ❯

Senior MLOps/GenAI Infrastructure Engineer

Salford, England, United Kingdom
Hybrid / WFH Options
BBC Group and Public Services
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration More ❯

Senior MLOps/GenAI Infrastructure Engineer

Cardiff, Wales, United Kingdom
Hybrid / WFH Options
BBC Group and Public Services
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration More ❯

Senior MLOps/GenAI Infrastructure Engineer

Glasgow, Scotland, United Kingdom
Hybrid / WFH Options
BBC Group and Public Services
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration More ❯

Senior MLOps/GenAI Infrastructure Engineer

Newcastle upon Tyne, England, United Kingdom
Hybrid / WFH Options
BBC Group and Public Services
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration More ❯

Software Engineer - Observability (Remote Scotland)

Dundee, Angus, United Kingdom
Hybrid / WFH Options
Ivanti
user experience. This department plays a pivotal role in shaping the company's growth trajectory through continuous innovation and customer-centric solutions. What You Will Be Doing Assist in Observability Implementation: Support the development and maintenance of monitoring, logging, and tracing solutions. Monitor & Manage Observability Tools: Help deploy and manage observability platforms such as Azure Application Insights (AppInsights), New Relic … Resolution) and reduce false positives. Ensure Cloud & Infrastructure Visibility: Contribute to scalable monitoring solutions for AWS and Azure environments. Collaborate with DevOps & SRE Teams: Work with teams to integrate observability best practices into CI/CD pipelines. Documentation & Knowledge Sharing: Contribute to runbooks, dashboards, and best practice guides to support observability initiatives. To Be Successful in The Role, You Will … Have Required Qualifications: 3-5 years of experience in observability, monitoring, or DevOps-related roles. Basic experience with monitoring tools such as Azure AppInsights, New Relic, Prometheus, and Grafana. Understanding of OpenTelemetry, New Relic, AppInsights APM for telemetry data collection. Familiarity with AWS and Azure cloud environments. Exposure to Kubernetes and container monitoring. Basic scripting knowledge (Python, Go, Bash, or More ❯
Employment Type: Permanent
Salary: GBP Annual

Technical Account Manager

Slough, England, United Kingdom
JR United Kingdom
/join with: London, UK | Full-time | Senior-Level I'm hiring for a Technical Account Manager on behalf of a high-growth SaaS company building a next-generation observability platform. Their technology helps engineering teams monitor, analyse, and act on their logs, metrics, traces, and security data, improving performance and cutting observability spend. This is a senior, customer-facing … technical role ideal for someone with a background in cloud infrastructure, observability tools, and DevOps. You’ll play a key role in onboarding, supporting, and expanding relationships with enterprise customers, from hands-on implementation to strategic advisory. What You’ll Be Doing: Own the technical onboarding journey for new customers, from data integration to configuration and enablement. Work closely with … DevOps, SREs, and engineering teams to understand requirements and deliver high-impact observability solutions. Troubleshoot complex infrastructure issues (Kubernetes, Docker, pipelines, etc.) and advise on best practices. Act as a trusted technical advisor, providing guidance on implementation, optimisation, and long-term success. Partner with sales and customer success teams on renewals, expansions, and QBRs. What You Bring: Strong hands-on More ❯

SR Site Reliability Engineer

Mablethorpe, England, United Kingdom
Wakapi
systems using load balancing, auto-scaling, canary releases, and blue-green deployments. Develop and maintain monitoring and logging dashboards with tools like New Relic, Prometheus, Grafana, and Datadog, ensuring observability through metrics, tracing, log aggregation, and alerting. Help teams determine settings and thresholds for alerts and automations based on application performance requirements. Monitor, optimize, and ensure system reliability and performance … like Terraform. Strong understanding of scalability, high availability patterns, and DevOps metrics such as DORA. Knowledge of SLM metrics (SLAs, SLOs, SLIs) and their application. Experience with monitoring and observability tools like New Relic, Prometheus, Grafana, and Datadog. Experience working with Kafka and improving performance in event-driven, real-time data architectures. Familiarity with cloud providers like AWS, Azure, or … GCP. Experience with CI/CD tools such as GitHub Actions, Jenkins, or GitLab CI. Strong analytical and communication skills. Nice-to-haves Familiarity with Observability-as-Code tooling and practices. Knowledge of Chaos Engineering practices. Senior Level: Mid-Senior, Employment: Full-time, Industry: Software Development More ❯

Technical Account Manager - DevOps Specialist

Slough, England, United Kingdom
JR United Kingdom
wide Job Description: Technical Account Manager - DevOps Specialist London - Hybrid (2 days per week in office) · Full-time · Senior About the company My client are rebuilding the path to observability using a real-time streaming analytics pipeline that provides monitoring, visualization, and alerting capabilities without the burden of indexing. By enabling users to define different data pipelines per use case … we provide deep Observability and Security insights, at an infinite scale, for less than half the cost. About the Position Technical Account Managers in my client are key in our effort to meet our customer’s expectations and help them utilize their observability and security data in the most efficient way possible. We are looking for hard-working, sharp, and … humble professionals with proven technical customer-facing experience. Their Technical Account Managers are trusted advisors and consult their customers upon their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are expected to professionally and accurately solve More ❯

DevOps Engineer - AWS

South East London, England, United Kingdom
Hybrid / WFH Options
Cognitive Group | Part of the Focus Cloud Group
and service incidents with root cause analysis and preventive measures. Handle change requests, track recurring issues, and work on long-term fixes to improve system stability. Implement and maintain observability solutions using Prometheus, Grafana, and Splunk. Write PromQL queries for custom monitoring dashboards, alerting, and diagnostics. Manage and optimize CI/CD pipelines for automated testing, deployment, and rollback strategies. … AWS services at the DevOps Engineer level Incident, change & problem management experience. This role is heavily operation-oriented, including on-call requirements Strong background in setup & operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL Proficient in one or more languages of Python, Go, Bash, SQL Familiar with GitHub/GitOps/container orchestration/ More ❯

AI ML Lead Site Reliability Engineer

Glasgow, Scotland, United Kingdom
JPMorgan Chase & Co
during major incidents, quickly identifying and resolving issues to prevent financial losses. Partner with product engineering teams to ensure AI/ML systems are reliable and high-performing. Develop observability, security, automation, and fin-ops tools and orchestration solutions. Provide strategic technology leadership by defining standards and architectures for reliability and automation frameworks. Build strong cross-functional relationships to deliver … least one programming language such as Python, Java Spring Boot, or .Net. Deep knowledge of software applications and technical processes, with emerging expertise in specific technical disciplines. Experience with observability tools like Grafana, Dynatrace, Prometheus, Datadog, Splunk, including monitoring, SLO alerting, and telemetry collection. Proficiency with CI/CD tools such as Jenkins, GitLab, Terraform. Experience with containerization and orchestration … drive. Preferred qualifications, capabilities, and skills Experience in AI, ML, or Data engineering. Expertise in Kubernetes and container orchestration. Experience developing automation frameworks or AI Ops solutions. Experience building observability and telemetry tools. About Us J.P. Morgan is a global leader in financial services, providing strategic advice and products to prominent clients worldwide. We value diversity and inclusion, and are More ❯

Site Reliability Engineer

Reigate, England, United Kingdom
Hybrid / WFH Options
Willis Towers Watson
Engineer to join our SRE team based in Reigate. The ideal candidate will have excellent communication skills, experience working with multiple stakeholders, and a track record in Azure and Observability platforms. You will be joining Insurance Consulting and Technology (ICT) at an exciting time of transformation as we work on improving the delivery of value for customers and the business. … office up to two days per week. The Role: Collaborate with cross-functional teams to ensure the reliability, availability, and performance of our client-facing services Maintain and configure observability platforms such as Datadog Proactive monitoring of production and other environments to ensure stability, availability, security and integrity Design and implement automation and processes to improve the efficiency and effectiveness … DevOps Experience of running 24x7 services in a public cloud, ideally Azure Deep understanding of cloud infrastructure and services, including best practices for monitoring, scaling, and security Experience with observability platforms such as Datadog or similar tools Strong interpersonal skills, with the ability to work effectively with many stakeholders Solid verbal and written communication skills, and the ability to present More ❯

DevOps Engineer

Glasgow, Scotland, United Kingdom
ELLIOTT MOSS CONSULTING PTE. LTD
Develop and optimize CI/CD pipelines using GitHub Actions, ensuring fast and reliable software delivery. · Manage containerized applications using Docker, Kubernetes, Amazon EKS, and Helm. · Administer and enhance observability using log aggregation and monitoring tools such as CloudWatch, Splunk, and Datadog. · Maintain and manage artifact repositories (e.g., JFrog Artifactory) and ensure effective dependency management. · Automate and streamline system operations … plus. Requirements: · 3+ years of practical experience with AWS cloud services and infrastructure management. AWS certifications are advantageous. · Strong experience with Infrastructure as Code tools (Terraform, CloudFormation) · Familiarity with observability and monitoring tools (CloudWatch, Splunk, Datadog). · Experience managing CI/CD workflows, especially with GitHub Actions. · Strong knowledge of artifact repository management systems like JFrog. · Proficient in Linux administration More ❯

Infrastructure Engineer

Edinburgh, United Kingdom
hackajob
customers consume our products. Additionally, you'll: People manage a team, developing skillsets and capabilities to support strategic outcomes Develop technical skills through continuous learning and development Support strategic observability, maintaining a strong awareness of service, creating operational views of data, and supporting the development of targets for the team to deliver against Provide operational support for product and service … would experience of Python, Terraform, Ansible, and PowerShell. Ideally, you'll also have experience in data centre networking, including software-defined networking. Furthermore, you'll need: Experience of using observability tools and techniques with the ability to use data, information, and user sentiment to continuously improve solutions In depth public cloud vendor knowledge covering GCP, AWS, and Azure, Extensive experience More ❯
Employment Type: Permanent
Salary: GBP Annual

Snr. Site Reliability Engineer (Remote) (Position located in Sheffield, United Kingdom)

Sheffield, England, United Kingdom
Hybrid / WFH Options
KnowBe4
Responsibilities: Manage and maintain environments to ensure high availability and security. Design and implement CI/CD pipelines to automate software delivery. Monitor and troubleshoot system performance issues, using observability tools like Prometheus, Grafana, or Datadog. Collaborate with development teams to align infrastructure efforts with project needs and timelines. Build and maintain infrastructure as code (IaC) solutions using tools … automated pipelines for continuous delivery. AWS or Azure Cloud Expertise: Strong knowledge of AWS/Azure services. Infrastructure-as-Code: Proficiency in Terraform, Ansible, or similar tools. Monitoring and Observability: Experience with Prometheus, Grafana, Datadog, or other observability platforms. Automation and Scripting: Proficiency in Python, Bash, or other scripting languages to automate tasks. Incident Management: Ability to lead incident response More ❯

Senior DevOps Engineer

Woking, England, United Kingdom
Fletcher Chase
that affect millions of users Design and implement Infrastructure as Code solutions that set industry standards Build resilient CI/CD pipelines using Bitbucket and Spacelift orchestration Develop sophisticated observability strategies with Grafana, CloudWatch, and advanced monitoring tools Leadership & Growth Opportunities Mentor emerging DevOps talent and shape team culture Influence architectural decisions across cross-functional teams Drive strategic initiatives that … TypeScript capabilities (this is code-heavy DevOps) Cloud Platforms: Recent AWS experience with enterprise-scale deployments CI/CD Mastery: Advanced experience with Jenkins, Bitbucket Pipelines, and orchestration tools Observability: Hands-on expertise with Grafana, Splunk, CloudWatch for proactive monitoring Leadership & Delivery: Proven track record architecting scalable, secure infrastructure solutions Experience implementing advanced security measures across DevOps workflows Large-scale More ❯

Infrastructure Engineer

York, Yorkshire, United Kingdom
Polo's Point S Tire
performance issues Managing regular patching and upgrade cycles for Infrastructure and Software Managing security vulnerabilities and performing platform hardening activities Developing automation to remove manual tasks Developing and maintaining observability dashboards and alerting Collaborating with Software Engineers and Users across the business Required skills and experience: Strong knowledge of at least one Public Cloud provider: Azure, AWS or GCP (Managed … Compute, Networking, RBAC/IAM) Prior experience in Linux system administration in a production environment Prior experience in provisioning and operating Kubernetes clusters in a production environment Experience in observability with Grafana with a good understanding of PromQL and LogQL Good knowledge of using Infrastructure-as-Code solutions such as Terraform Comfortable with scripting for automation using Bash and Python More ❯
Employment Type: Permanent
Salary: GBP Annual

Solution Architect

Alderley Edge, England, United Kingdom
Medirest Signature
understanding of modern architecture methods and patterns. Composable Architecture based on MACH principles (Microservices, API-first, Cloud-native, Headless), Event Driven. Skills to modernise architectural estates and drive serviceability, observability dashboarding and metrics in end products. Experience of Digital Transformation within either Java or Microsoft technologies landscape, Azure platform and .Net ecosystem. Expertise in Mobile and Web development frameworks and … languages like .Net, Java, Python Database technologies and platforms like SQL, NoSQL, Data Lake, Snowflake, Databricks, MongoDB, Oracle Frontend web development languages like React, Angular, JavaScript, HTML and CSS Observability platforms like Splunk, Dynatrace, Datadog, Grafana Integration technologies like REST, Kafka, iPaaS, API Management, ESB Awareness of placement of workloads on On-Prem Servers and Cloud (Azure/AWS/ More ❯

Senior AWS Platform Engineer

Slough, England, United Kingdom
Hybrid / WFH Options
JR United Kingdom
and will help clients adopt modern DevOps practices with a strong emphasis on automation, self-service, and operational excellence. Tech You'll Use: Terraform & GitHub Actions CI/CD, observability tooling (Grafana, Prometheus), containerisation (Docker) What You'll Be Doing: Designing and implementing secure, resilient AWS infrastructure Building CI/CD pipelines and reusable deployment patterns Advising on cloud-native More ❯

Observability, the UK excluding London
10th Percentile: £49,563
25th Percentile: £61,563
Median: £74,500
75th Percentile: £85,000
90th Percentile: £98,500