Observability Jobs in England

151 to 175 of 2,462 Observability Jobs in England

DevOps Engineer III

London, England, United Kingdom
Take-Two Interactive
to ask for help when needed. Strong proficiency in AWS, Azure, or GCP, with hands-on experience with Terraform. Experience with configuration management tools such as Ansible or Puppet, and observability tools such as Prometheus, Grafana, and Datadog. Design, develop, automate testing, and deploy custom tools using languages such as Python or C#. Version control administration (e.g., GitHub, Perforce). Great to Have …

Senior Data Engineer

London, United Kingdom
Hybrid / WFH Options
VivaCity
mentoring engineers and collaborating with stakeholders. Proven ability to resolve technical incidents in unfamiliar production systems. Technical and process documentation champion. Experience of operationally managing production software components, including observability, logging, metrics, error reporting, debugging, and live incident management. Your time will be spent roughly as follows: 60% - Proactive technical work (e.g. migrating DB hosting provider, new message bus system More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer

London, United Kingdom
P2P
FX or crypto trading; front-end experience with React or similar frameworks is a plus. Collaborate with the team to implement, configure, and manage comprehensive monitoring, logging, alerting, and observability solutions - advocating for security best practices. Deploy, manage, operate, and scale applications and services on AWS - whilst troubleshooting performance issues across the stack. Collaborative, agile approach, passionate about clean architecture More ❯
Employment Type: Permanent
Salary: GBP Annual

DevOps Engineer Europe

London, England, United Kingdom
Hybrid / WFH Options
Smartcat Platform Inc
familiar with DevOps tools and processes. Confidently navigate through Platform Infrastructure. Day 60 Join the process of being on duty in a team, be able to analyze problems, use observability/monitoring tools and handle investigations. Support Production releases and address blockers of CI/CD process. Day 90 Complete two quarter deliverable in alignment with Outcomes. WHAT YOU’VE More ❯

Lead Software Engineer

Bristol, England, United Kingdom
Hybrid / WFH Options
Lloyds Banking Group
collaboration skills, with the ability to influence and align diverse teams on a shared vision. Knowledge of DevOps practices and tools CI/CD pipelines. Knowledge of Monitoring and Observability tooling. In addition, any experience of these would be useful: Experience with data mesh concepts (e.g., domain-driven ownership, and data product thinking). Expertise in GCP services, including BigQuery More ❯

Senior MLOps/GenAI Infrastructure Engineer

London, England, United Kingdom
Hybrid / WFH Options
BBC Group and Public Services
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration More ❯

Senior MLOps/GenAI Infrastructure Engineer

Salford, England, United Kingdom
Hybrid / WFH Options
BBC Group and Public Services
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration More ❯

Senior MLOps/GenAI Infrastructure Engineer

Newcastle upon Tyne, England, United Kingdom
Hybrid / WFH Options
BBC Group and Public Services
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration More ❯

Lead Devops

London, England, United Kingdom
Tata Consultancy Services
teams to build secure, scalable, and cost-efficient cloud solutions. You will be provided with access to cutting-edge cloud technologies, including AWS serverless computing, Kubernetes orchestration, AI-driven observability, and security automation, keeping you at the forefront of innovation. Your responsibilities: Implement and manage highly available, scalable, and secure applications hosted on AWS Cloud, leveraging multi-region deployment strategies More ❯

Senior Platform Engineer, Observability

London, England, United Kingdom
Forter
At Forter, you’ll have the chance to make a direct impact on the developer experience … across the company while working with cutting-edge observability technologies and practices. We value innovation, collaboration, and continuous improvement, and we can’t wait to see what you’ll bring to our team! About the role: At Forter, we are looking for a Senior Platform Engineer, Observability with a strong development background and hands-on experience in observability tooling (ELK … we observe and troubleshoot our systems. We efficiently handle TBs of o11y data per day with very few incidents. In this role, you will help shape the future of observability at Forter, building scalable monitoring systems, creating intuitive developer tools, and collaborating with cross-functional teams to ensure our systems are both highly reliable and easy to troubleshoot. We’re …

Technical Account Manager

Slough, England, United Kingdom
JR United Kingdom
London, UK | Full-time | Senior-Level. I'm hiring for a Technical Account Manager on behalf of a high-growth SaaS company building a next-generation observability platform. Their technology helps engineering teams monitor, analyse, and act on their logs, metrics, traces, and security data, improving performance and cutting observability spend. This is a senior, customer-facing … technical role ideal for someone with a background in cloud infrastructure, observability tools, and DevOps. You’ll play a key role in onboarding, supporting, and expanding relationships with enterprise customers, from hands-on implementation to strategic advisory. What You’ll Be Doing: Own the technical onboarding journey for new customers, from data integration to configuration and enablement. Work closely with … DevOps, SREs, and engineering teams to understand requirements and deliver high-impact observability solutions. Troubleshoot complex infrastructure issues (Kubernetes, Docker, pipelines, etc.) and advise on best practices. Act as a trusted technical advisor, providing guidance on implementation, optimisation, and long-term success. Partner with sales and customer success teams on renewals, expansions, and QBRs. What You Bring: Strong hands-on …

Technical Account Manager

London, United Kingdom
Coralogix, inc
us on our journey to revolutionize observability. In 2023, Dun & Bradstreet ranked Coralogix as one of the best tech startups to work for. Coralogix is a modern, full-stack observability platform transforming how businesses process and understand their data. Our unique architecture powers in-stream analytics without reliance on expensive indexing or hot storage. We specialize in comprehensive monitoring of … logs, metrics, trace and security events with features such as APM, RUM, SIEM, Kubernetes monitoring and more, all enhancing operational efficiency and reducing observability spend by up to 70%. Technical Account Managers in Coralogix are key in our effort to meet our customer's expectations and help them utilize their observability and security data in the most efficient way … looking for hard-working, sharp, and humble professionals with proven technical customer-facing experience. Our Technical Account Managers are trusted advisors and consult our customers upon their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are More ❯
Employment Type: Permanent
Salary: GBP Annual

SR Site Reliability Engineer

Mablethorpe, England, United Kingdom
Wakapi
systems using load balancing, auto-scaling, canary releases, and blue-green deployments. Develop and maintain monitoring and logging dashboards with tools like New Relic, Prometheus, Grafana, and Datadog, ensuring observability through metrics, tracing, log aggregation, and alerting. Help teams determine settings and thresholds for alerts and automations based on application performance requirements. Monitor, optimize, and ensure system reliability and performance … like Terraform. Strong understanding of scalability, high availability patterns, and DevOps metrics such as DORA. Knowledge of SLM metrics (SLAs, SLOs, SLIs) and their application. Experience with monitoring and observability tools like New Relic, Prometheus, Grafana, and Datadog. Experience working with Kafka and improving performance in event-driven, real-time data architectures. Familiarity with cloud providers like AWS, Azure, or … GCP. Experience with CI/CD tools such as GitHub Actions, Jenkins, or GitLab CI. Strong analytical and communication skills. Nice-to-haves: Familiarity with Observability-as-Code tooling and practices. Knowledge of Chaos Engineering practices. Senior Level: Mid-Senior, Employment: Full-time, Industry: Software Development

Technical Account Manager - DevOps Specialist

City of London, London, United Kingdom
ITR Partners
Technical Account Manager - DevOps Specialist London - Hybrid (2 days per week in office) · Full-time · Senior About the company My client are rebuilding the path to observability using a real-time streaming analytics pipeline that provides monitoring, visualization, and alerting capabilities without the burden of indexing. By enabling users to define different data pipelines per use case, we provide deep … Observability and Security insights, at an infinite scale, for less than half the cost. About the Position Technical Account Managers in my client are key in our effort to meet our customer’s expectations and help them utilize their observability and security data in the most efficient way possible. We are looking for hard-working, sharp, and humble professionals with … proven technical customer-facing experience. Their Technical Account Managers are trusted advisors and consult their customers upon their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are expected to professionally and accurately solve problems, show product More ❯

Technical Account Manager - DevOps Specialist

London Area, United Kingdom
ITR Partners
Technical Account Manager - DevOps Specialist London - Hybrid (2 days per week in office) · Full-time · Senior About the company My client are rebuilding the path to observability using a real-time streaming analytics pipeline that provides monitoring, visualization, and alerting capabilities without the burden of indexing. By enabling users to define different data pipelines per use case, we provide deep … Observability and Security insights, at an infinite scale, for less than half the cost. About the Position Technical Account Managers in my client are key in our effort to meet our customer’s expectations and help them utilize their observability and security data in the most efficient way possible. We are looking for hard-working, sharp, and humble professionals with … proven technical customer-facing experience. Their Technical Account Managers are trusted advisors and consult their customers upon their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are expected to professionally and accurately solve problems, show product More ❯

Technical Account Manager - DevOps Specialist

Slough, England, United Kingdom
JR United Kingdom
wide Job Description: Technical Account Manager - DevOps Specialist London - Hybrid (2 days per week in office) · Full-time · Senior About the company My client are rebuilding the path to observability using a real-time streaming analytics pipeline that provides monitoring, visualization, and alerting capabilities without the burden of indexing. By enabling users to define different data pipelines per use case … we provide deep Observability and Security insights, at an infinite scale, for less than half the cost. About the Position Technical Account Managers in my client are key in our effort to meet our customer’s expectations and help them utilize their observability and security data in the most efficient way possible. We are looking for hard-working, sharp, and … humble professionals with proven technical customer-facing experience. Their Technical Account Managers are trusted advisors and consult their customers upon their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are expected to professionally and accurately solve More ❯

Technical Account Manager

London, England, United Kingdom
Coralogix, inc
us on our journey to revolutionize observability. In 2023, Dun & Bradstreet ranked Coralogix as one of the best tech startups to work for. Coralogix is a modern, full-stack observability platform transforming how businesses process and understand their data. Our unique architecture powers in-stream analytics without reliance on expensive indexing or hot storage. We specialize in comprehensive monitoring of … logs, metrics, trace and security events with features such as APM, RUM, SIEM, Kubernetes monitoring and more, all enhancing operational efficiency and reducing observability spend by up to 70%. Technical Account Managers in Coralogix are key in our effort to meet our customer’s expectations and help them utilize their observability and security data in the most efficient way … looking for hard-working, sharp, and humble professionals with proven technical customer-facing experience. Our Technical Account Managers are trusted advisors and consult our customers upon their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are More ❯

DevOps Engineer - AWS

London Area, United Kingdom
Hybrid / WFH Options
Cognitive Group | Part of the Focus Cloud Group
and service incidents with root cause analysis and preventive measures. Handle change requests, track recurring issues, and work on long-term fixes to improve system stability. Implement and maintain observability solutions using Prometheus, Grafana, and Splunk. Write PromQL queries for custom monitoring dashboards, alerting, and diagnostics. Manage and optimize CI/CD pipelines for automated testing, deployment, and rollback strategies. … AWS services at the DevOps Engineer level Incident, change & problem management experience. This role is heavily operation-oriented, including on-call requirements Strong background in setup & operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL Proficient in one or more languages of Python, Go, Bash, SQL Familiar with GitHub/GitOps/container orchestration/ More ❯

DevOps Engineer - AWS

City of London, London, United Kingdom
Hybrid / WFH Options
Cognitive Group | Part of the Focus Cloud Group
and service incidents with root cause analysis and preventive measures. Handle change requests, track recurring issues, and work on long-term fixes to improve system stability. Implement and maintain observability solutions using Prometheus, Grafana, and Splunk. Write PromQL queries for custom monitoring dashboards, alerting, and diagnostics. Manage and optimize CI/CD pipelines for automated testing, deployment, and rollback strategies. … AWS services at the DevOps Engineer level Incident, change & problem management experience. This role is heavily operation-oriented, including on-call requirements Strong background in setup & operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL Proficient in one or more languages of Python, Go, Bash, SQL Familiar with GitHub/GitOps/container orchestration/ More ❯

DevOps Engineer - AWS

South East London, England, United Kingdom
Hybrid / WFH Options
Cognitive Group | Part of the Focus Cloud Group
and service incidents with root cause analysis and preventive measures. Handle change requests, track recurring issues, and work on long-term fixes to improve system stability. Implement and maintain observability solutions using Prometheus, Grafana, and Splunk. Write PromQL queries for custom monitoring dashboards, alerting, and diagnostics. Manage and optimize CI/CD pipelines for automated testing, deployment, and rollback strategies. … AWS services at the DevOps Engineer level Incident, change & problem management experience. This role is heavily operation-oriented, including on-call requirements Strong background in setup & operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL Proficient in one or more languages of Python, Go, Bash, SQL Familiar with GitHub/GitOps/container orchestration/ More ❯

Platform Engineer

London, England, United Kingdom
Hybrid / WFH Options
CATCHES
and CI/CD capabilities. RESPONSIBILITIES: Orchestrate and maintain our bare-metal and GCP infrastructure. Implement infrastructure-as-code (Terraform) and automated release workflows that enable true continuous delivery. Drive observability: log aggregation, metrics, distributed tracing and on-call runbooks. Champion security, cost-efficiency and performance tuning across our services. Collaborate with product and platform teams to ship end-to-end … features and migrations. REQUIREMENTS: Extensive experience orchestrating infrastructure at scale across cloud and bare metal. SRE & Kubernetes expertise (GKE/AKS/EKS) and container-native observability stacks (Datadog/Prometheus/Grafana). Proven ownership of CI/CD pipelines (GitHub Actions, Cloud Build, Azure DevOps, etc.) and release automation. Proven experience with multiplatform scripting languages (Python, Bash, PowerShell). … TECH STACK: Cloud: GCP (primary), Azure (minimal). Languages: Terraform, Python, Bash, PowerShell. Databases: PostgreSQL, Redis, BigQuery. Messaging: Pub/Sub, RabbitMQ. Infra & Ops: Docker, Kubernetes, Terraform, GitHub Actions, Proxmox. Observability: OpenTelemetry, Datadog.

Site Reliability Engineer

City of London, England, United Kingdom
Whitehall Resources Ltd
in managing cloud infrastructure, ensuring the reliability of production systems, and improving end-to-end deployment pipelines. This role combines deep operational responsibilities with a strong focus on automation, observability, and continuous improvement. You will be responsible for maintaining high system availability, enabling rapid delivery through CI/CD, and supporting development teams with robust infrastructure and tooling. A key … incidents with root cause analysis and preventive measures. 3. Handle change requests, track recurring issues, and work on long-term fixes to improve system stability. 4. Implement and maintain observability solutions using Prometheus, Grafana, and Splunk. 5. Write PromQL queries for custom monitoring dashboards, alerting, and diagnostics. 6. Manage and optimize CI/CD pipelines for automated testing, deployment, and … at the DevOps Engineer level 2. Incident, change & problem management experience. This role is heavily operation-oriented, including on-call requirements 3. Strong background in setup & operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL 4. Proficient in one or more languages of Python, Go, Bash, SQL 5. Familiar with GitHub/GitOps/container More ❯

Site Reliability Engineer

Reigate, England, United Kingdom
Hybrid / WFH Options
Willis Towers Watson
Engineer to join our SRE team based in Reigate. The ideal candidate will have excellent communication skills, experience working with multiple stakeholders, and a track record in Azure and Observability platforms. You will be joining Insurance Consulting and Technology (ICT) at an exciting time of transformation as we work on improving the delivery of value for customers and the business. … office up to two days per week. The Role: Collaborate with cross-functional teams to ensure the reliability, availability, and performance of our client-facing services Maintain and configure observability platforms such as Datadog Proactive monitoring of production and other environments to ensure stability, availability, security and integrity Design and implement automation and processes to improve the efficiency and effectiveness … DevOps Experience of running 24x7 services in a public cloud, ideally Azure Deep understanding of cloud infrastructure and services, including best practices for monitoring, scaling, and security Experience with observability platforms such as Datadog or similar tools Strong interpersonal skills, with the ability to work effectively with many stakeholders Solid verbal and written communication skills, and the ability to present More ❯

Senior Storage Engineer

City of London, London, United Kingdom
NJF Global Holdings Ltd
scale, data-intensive workloads. Implement and maintain DevOps tooling (Terraform, Ansible, GitLab CI/CD, Jenkins). Lead PoCs for new storage technologies and present results to technical leadership. Support observability via Grafana, Prometheus, Splunk, and related platforms. Contribute to containerization efforts with Docker and Kubernetes (preferred). What We’re Looking For: 8+ years of experience in storage systems administration and … kernel bypass). Strong understanding of Linux performance tuning, particularly in HPC or ML/AI contexts. Programming/scripting experience in Python, Golang, or similar languages. Familiarity with modern observability and monitoring tools (Grafana, Prometheus, Splunk). Experience supporting AI/ML modelling environments is highly desirable. Knowledge of container and orchestration technologies (Docker, Kubernetes) is a plus. Proactive, collaborative, and …

Senior Storage Engineer

London Area, United Kingdom
NJF Global Holdings Ltd
scale, data-intensive workloads. Implement and maintain DevOps tooling (Terraform, Ansible, GitLab CI/CD, Jenkins). Lead PoCs for new storage technologies and present results to technical leadership. Support observability via Grafana, Prometheus, Splunk, and related platforms. Contribute to containerization efforts with Docker and Kubernetes (preferred). What We’re Looking For: 8+ years of experience in storage systems administration and … kernel bypass). Strong understanding of Linux performance tuning, particularly in HPC or ML/AI contexts. Programming/scripting experience in Python, Golang, or similar languages. Familiarity with modern observability and monitoring tools (Grafana, Prometheus, Splunk). Experience supporting AI/ML modelling environments is highly desirable. Knowledge of container and orchestration technologies (Docker, Kubernetes) is a plus. Proactive, collaborative, and …
Observability, England
10th Percentile: £57,500
25th Percentile: £65,000
Median: £80,000
75th Percentile: £97,500
90th Percentile: £117,500