Observability Jobs in England

26 to 50 of 2,505 Observability Jobs in England

Site Reliability Engineer

Hampshire, England, United Kingdom
Hybrid / WFH Options
Spectrum IT Recruitment
level goals. You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus. Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo. Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck. Experience using configuration management … principles and hands-on experience with tools such as Jenkins, GitLab CI/CD, or CircleCI. Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture. Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch. Excellent analytical and troubleshooting abilities, especially within complex distributed systems. Proven experience handling incident management and conducting blameless …

Site Reliability Engineer

London, England, United Kingdom
Hybrid / WFH Options
ZipRecruiter
level goals. You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus. Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo. Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck. Experience using configuration management … principles and hands-on experience with tools such as Jenkins, GitLab CI/CD, or CircleCI. Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture. Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch. Excellent analytical and troubleshooting abilities, especially within complex distributed systems. Proven experience handling incident management and conducting blameless …

Senior Product Manager, DevOps and Infrastructure Products

Stevenage, Hertfordshire, United Kingdom
GlaxoSmithKline
along with the Onyx portfolio management team, to deliver industry-leading DevOps and Infrastructure products that provide Infrastructure-as-code abstractions and operating principles, leading cloud computing capability, automation, observability, operability, and developer experience. You will drive the product roadmap, guide product development initiatives, and ensure the successful launch and adoption of DevOps and Infrastructure products. Together, you will facilitate … the following characteristics, it would be a plus: Strong understanding of modern infrastructure and site reliability engineering practice, including Infrastructure-as-code tools (e.g. Terraform, Ansible) and metrics and observability tools (e.g. Prometheus, Grafana). Strong understanding of modern DevOps practice, including DevOps stacks (e.g. Jenkins, GitLab, CircleCI). Cloud experience (e.g. AWS, Google Cloud, Azure, Kubernetes). Familiar with …
Employment Type: Permanent
Salary: GBP Annual

DevOps Engineer

London, England, United Kingdom
Hybrid / WFH Options
Canada Life
infrastructure to the cloud and understanding the challenges involved. Familiarity with cloud security best practices, identity and access management (IAM), and encryption techniques. Microsoft Azure certifications are a plus. Observability: Designing, implementing and day-to-day use of logging and monitoring tools to capture data for alerting and issue identification and resolution using Datadog, App Insights or similar tools. Designing … applications and infrastructure for observability, security, and reliability. Networking & Security: Monitor and enhance network performance, ensuring high levels of security and scalability across all cloud environments. Enforce security best practices in AKS, including network policies, RBAC (Role-Based Access Control), and integration with Azure Active Directory. Core Services: Software development experience, ideally in .NET stack. SQL skills to manage and …

DevOps/SRE Engineer

London, England, United Kingdom
Hybrid / WFH Options
Redefined Ltd
infrastructure to the cloud and understanding the challenges involved. Familiarity with cloud security best practices, identity and access management (IAM), and encryption techniques. Microsoft Azure certifications are a plus. Observability: Designing, implementing and day-to-day use of logging and monitoring tools to capture data for alerting and issue identification and resolution using Datadog, App Insights or similar tools. Designing … applications and infrastructure for observability, security, and reliability. Networking & Security: Monitor and enhance network performance, ensuring high levels of security and scalability across all cloud environments. Enforce security best practices in AKS, including network policies, RBAC (Role-Based Access Control), and integration with Azure Active Directory. Core Services: Azure core services such as Azure Storage, including Blob, Azure VMs, Azure …

Senior Production Operations Engineer

London, England, United Kingdom
Index Exchange
emergency events outside of your local time-zone. Here's what you need: Technical Expertise: In-depth understanding of the Linux operating environment: kernel tuning, network stack tuning, system observability & instrumentation, and security & access management. Solid understanding of layer 2-7 networking fundamentals and the relationship between servers & services, and the transit of their packets through network hardware. In-depth … experience engineering and maintaining a private-cloud infrastructure: Bare-metal, vSphere, KVM, Kubernetes. Experience with tools like Ansible, Terraform, Docker, Kafka, Nexus. Experience with observability platforms: InfluxDB, Prometheus, ELK, Jaeger, Grafana, Nagios, Zabbix. Familiarity with Big Data tools: Hadoop, HDFS, Spark, HBase. Ability to write code in Go, Python, Bash, or Perl for automation. Work Experience: 6-8 years of …

Senior SRE

London, England, United Kingdom
Index Exchange
emergency events outside of your local time-zone. Here's What You Need: Technical Expertise: In-depth understanding of the Linux operating environment: kernel tuning, network stack tuning, system observability & instrumentation, and security & access management. Solid understanding of layer 2-7 networking fundamentals and the relationship between servers & services, and the transit of their packets through network hardware. In-depth … experience engineering and maintaining a private-cloud infrastructure: Bare-metal, vSphere, KVM, Kubernetes. Experience with tools like Ansible, Terraform, Docker, Kafka, Nexus. Experience with observability platforms: InfluxDB, Prometheus, ELK, Jaeger, Grafana, Nagios, Zabbix. Familiarity with Big Data tools: Hadoop, HDFS, Spark, HBase. Ability to write code in Go, Python, Bash, or Perl for automation. Work Experience: 6-8 years of …

Principal DevOps Engineer

London, England, United Kingdom
Devoteam
tools, including experience with some of the following tools: GitLab CI, GitHub Actions, Concourse CI, Jenkins X, TeamCity, Artifactory, etc.; Infrastructure provisioning (at least one of Terraform, Ansible, CloudFormation); Observability and Application monitoring (ELK stack, TICK stack, Grafana, Prometheus, New Relic, Datadog, etc.); Networking concepts - Bastion hosts, Reverse Proxies, Load Balancing, TLS, etc. Key Soft Skills required: Naturally resilient, tenacious …

Senior DevOps Engineer

Manchester, Lancashire, United Kingdom
Hybrid / WFH Options
Arm Limited
infrastructure. "Nice To Have" Skills and Experience: Experience in a GitOps solution such as ArgoCD, Flux or Fleet. Implementation of the Security Development Lifecycle (SDL) in infrastructure. Monitoring and observability using Prometheus and Grafana, ELK stack or equivalent. Use of Kubernetes management systems such as Rancher. Familiarity with open source project development cycles and contribution processes, particularly around CI/…
Employment Type: Permanent
Salary: GBP Annual

Senior DevOps Engineer (SC Cleared)

London Area, United Kingdom
Hybrid / WFH Options
Amber Labs
working in Agile teams using tools like Git, Jira, and Confluence. Eligible for SC and NPPV3 clearance. Desirable: Container orchestration with Kubernetes. HashiCorp tools: Vault, Consul, Packer. Monitoring and observability with Grafana, Prometheus, or similar. Familiarity with cloud networking, VPCs, NAT Gateways, security groups, etc. Personal Attributes: Proactive and self-driven with a passion for technology. Strong problem-solving mindset …

Senior DevOps Engineer (SC Cleared)

City of London, London, United Kingdom
Hybrid / WFH Options
Amber Labs
working in Agile teams using tools like Git, Jira, and Confluence. Eligible for SC and NPPV3 clearance. Desirable: Container orchestration with Kubernetes. HashiCorp tools: Vault, Consul, Packer. Monitoring and observability with Grafana, Prometheus, or similar. Familiarity with cloud networking, VPCs, NAT Gateways, security groups, etc. Personal Attributes: Proactive and self-driven with a passion for technology. Strong problem-solving mindset …

Senior DevOps Engineer (SC Cleared)

South East London, England, United Kingdom
Hybrid / WFH Options
Amber Labs
working in Agile teams using tools like Git, Jira, and Confluence. Eligible for SC and NPPV3 clearance. Desirable: Container orchestration with Kubernetes. HashiCorp tools: Vault, Consul, Packer. Monitoring and observability with Grafana, Prometheus, or similar. Familiarity with cloud networking, VPCs, NAT Gateways, security groups, etc. Personal Attributes: Proactive and self-driven with a passion for technology. Strong problem-solving mindset …

Platform Engineer

London, England, United Kingdom
Capgemini
ARM, or Pulumi. Experience in building secure applications and infrastructure. Strong communication skills, with the ability to convey and understand complex technical concepts clearly and concisely. SRE skills including observability and telemetry monitoring. Familiarity with the HashiCorp Suite (Packer, Terraform, Vault, Vagrant, Consul). Experience in containerization using Docker, Kubernetes, OpenShift, and Helm. Programming skills in languages such as Python …

Senior Infrastructure Engineer I

London, England, United Kingdom
Ethica Consulting
and firewalls. Experience with load balancers (F5, HAProxy, Nginx) and network monitoring tools. Experience in DNS management and troubleshooting. Experience in network security best practices. Proficiency in monitoring and observability tools (Prometheus, Grafana, Splunk). Proficiency in at least one scripting language (Python, Bash) for automation. Experience with CI/CD pipeline management and DevOps practices. Strong understanding of disaster …

Vice President, DevOps Engineer (NE)

London, England, United Kingdom
Hybrid / WFH Options
BlackRock, Inc
access to the best tools available. We combine problem-solving skills with software and systems engineering to take a proactive approach in building fault-tolerant and secure systems, improving observability and zealously automating away toil. In this role you will: Use your site reliability expertise to design, operate and support Preqin's infrastructure, middleware and internal services. Improving their performance …

Senior Machine Learning Ops Engineer

London, England, United Kingdom
DailyPay
and high availability. CI/CD Pipeline Development: Develop and maintain robust CI/CD pipelines for continuous integration and deployment of ML models and related infrastructure. Monitoring and Observability: Build and maintain comprehensive monitoring and alerting systems for our ML infrastructure and models, leveraging tools like Datadog to ensure system health and performance. Collaboration and Mentorship: Collaborate effectively with …

Cloud Technical Architect / Data DevOps Engineer

Bristol, United Kingdom
Hewlett Packard Enterprise Development LP
etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Monitoring utilising products such as: Prometheus, Grafana, ELK, Filebeat etc. Observability - SRE. Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem. Edge technologies e.g. NGINX, HAProxy etc. Excellent knowledge of YAML or similar languages. The following …
Employment Type: Permanent
Salary: GBP Annual

Manual Tester (DV Security Clearance)

Basingstoke, Hampshire, South East
CGI
Manual Tester (DV Security Clearance). Position Description: CGI was recognised in the Sunday Times Best Places to Work List 2025 and has been named one of the 'World's Best Employers' by Forbes magazine. We offer a competitive salary, excellent …
Employment Type: Permanent

Senior Platform Delivery Consultant IRC250319

London, United Kingdom
GlobalLogic
Consultant IRC250319. Job: IRC250319. Location: United Kingdom - London. Designation: Senior Consultant. Experience: 5-10 years. Function: Engineering. Skills: Cloud (Azure/AWS/GCP), Containers, DevOps Practices, Grafana, Kubernetes, Observability stack, SRE Management, Terraform. Work Model: Hybrid. We are seeking an experienced Platform Engineering leader with a hands-on engineering background, who can articulate the business benefits that Observability and … on the responsibility of handling client engagements from both technical and business perspectives. Requirements: We are ideally looking for someone with a strong background and experience in the following: Observability and SRE Practices: In-depth understanding of observability and Site Reliability Engineering practices. Familiarity with tools in the LGTM stack (Loki, Grafana, Tempo, Mimir) or equivalent observability platforms. Containerisation: Strong …
Employment Type: Permanent
Salary: GBP Annual

Senior Site Reliability Engineer - AWS Kubernetes

London, England, United Kingdom
Source Technology
and firewalls. Experience with load balancers (F5, HAProxy, Nginx) and network monitoring tools. Experience in DNS management and troubleshooting. Experience in network security best practices. Proficiency in monitoring and observability tools (Prometheus, Grafana, Splunk). Proficiency in at least one scripting language (Python, Bash) for automation. Experience with CI/CD pipeline management and DevOps practices. Strong understanding of disaster … performance. Experience in tools like df, du, lsblk, and fdisk for managing and troubleshooting file systems and disk partitions. Familiarity with tools like Prometheus and Grafana for monitoring and observability. Seniority level: Not Applicable. Employment type: Full-time. Job function: Information Technology. Industries: Computer and Network Security.

Site Reliability Engineer, Lead

London, England, United Kingdom
Hybrid / WFH Options
Mistral AI
computing and highly available distributed systems • Exposure to site reliability issues in critical environments (issue root cause analysis, in-production troubleshooting, on-call rotations...) • Experience working against reliability KPIs (observability, alerting, SLAs) • Hands-on experience with CI/CD, containerization and orchestration tools (Docker, Kubernetes...), monitoring, logging, alerting and observability tools (Prometheus, Grafana, ELK Stack, Datadog...), infrastructure-as-code tools …

Staff SRE

London, United Kingdom
Index Exchange
emergency events outside of your local time-zone. Here's what you need: Technical Expertise: In-depth understanding of the Linux operating environment: kernel tuning, network stack tuning, system observability & instrumentation, and security & access management. Solid understanding of layer 2-7 networking fundamentals and the relationship between servers & services, and the transit of their packets through network hardware. In-depth … experience engineering and maintaining a private-cloud infrastructure: Bare-metal, vSphere, KVM, Kubernetes. Experience with tools like Ansible, Terraform, Docker, Kafka, Nexus. Experience with observability platforms: InfluxDB, Prometheus, ELK, Jaeger, Grafana, Nagios, Zabbix. Familiarity with Big Data tools: Hadoop, HDFS, Spark, HBase. Ability to write code in Go, Python, Bash, or Perl for automation. Work Experience: 5-7+ years …
Employment Type: Permanent
Salary: GBP Annual

Lead FX Trading Platform Specialist (DIR)

London, England, United Kingdom
London, United Kingdom
on industry trends and emerging technologies related to Forex trading and banking. Responsibilities: Architecture & Design • Define the target microservice/event-driven architecture (latency budget, throughput, HA & DR, observability). • Own protocol & data model standards for client-to-venue and internal flows (e.g. FIX 4.4/5.0, REST/gRPC, protobuf, JSON). Hands-on Engineering • Lead development of price … binary encoded protocols (SBE, FAST), RTDS, gRPC, Kafka, HTTPS API, UDP • Pipeline: GitHub, Jenkins, TeamCity, Sonar, XLDeploy, Docker, Kubernetes • Infra as code: Terraform, Ansible, Azure cloud • Datastores: PostgreSQL, OCP • Observability: ELK, Grafana, OpenTelemetry • Batch: Airflow (Python) • Security & Compliance: TLS, OAuth2/OIDC, data masking, GDPR/MiFID controls • Project & Process: Scrum/Kanban, backlog grooming, metrics-driven retrospectives. Why join …

Lead DevOps Engineer

Leeds, England, United Kingdom
Hybrid / WFH Options
ZipRecruiter
platform modernisation. Mentor and lead a small team of engineers. Align DevOps capabilities with the wider business. Champion DevEx, reliability, and security. Embed operational excellence and incident response. Promote observability and performance optimisation. Lead DevOps Engineer Requirements: Proven technical and some leader/mentoring experience. Cloud-native expertise (any cloud provider is fine: GCP, AWS or Azure). Knowledge of GitLab CI …

Lead DevOps Engineer

Bradford, England, United Kingdom
Hybrid / WFH Options
JR United Kingdom
platform modernisation. Mentor and lead a small team of engineers. Align DevOps capabilities with the wider business. Champion DevEx, reliability, and security. Embed operational excellence and incident response. Promote observability and performance optimisation. Proven technical and some leader/mentoring experience. Cloud-native expertise (any cloud provider is fine: GCP, AWS or Azure). Knowledge of GitLab CI/CD, Terraform …

Observability Salaries in England
10th Percentile: £57,500
25th Percentile: £65,000
Median: £77,500
75th Percentile: £97,500
90th Percentile: £117,500