Grays, England, United Kingdom Hybrid / WFH Options
TES
microservices design patterns and deployment strategies in a cloud-native environment. Security Best Practices: Strong understanding of security frameworks and compliance standards for cloud infrastructure and DevOps processes. Monitoring & Observability: Understanding of monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK) to ensure system performance and issue tracking. Skills CI/CD Tools: Hands-on experience with Jenkins, GitLab CI More ❯
London, England, United Kingdom Hybrid / WFH Options
Bohemian Rhapsody Silver
microservices design patterns and deployment strategies in a cloud-native environment. Security Best Practices: Strong understanding of security frameworks and compliance standards for cloud infrastructure and DevOps processes. Monitoring & Observability: Understanding of monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK) to ensure system performance and issue tracking. Skills CI/CD Tools: Hands-on experience with Jenkins, GitLab CI More ❯
Senior DevOps Engineer - Monitoring & Observability role at Lumenalta. As a Senior DevOps Engineer at Lumenalta, you will be pivotal in architecting and managing cloud-based systems on AWS, implementing CI/CD pipelines, and automating infrastructure deployment using tools like Terraform and AWS CDK. You will … to automate application builds, testing, and deployments. Infrastructure as Code (IaC): Use Terraform, AWS CDK, or CloudFormation to automate cloud resource provisioning, enabling consistent and repeatable infrastructure deployments. Monitoring & Observability: Implement monitoring, logging, and alerting solutions using tools like Prometheus, Grafana, Loki, Datadog, or CloudWatch to ensure system health and performance. Security & Compliance: Implement security best practices for cloud infrastructure More ❯
London, England, United Kingdom Hybrid / WFH Options
Quaisr Limited
such as Kubernetes, Docker Swarm, or HashiCorp Nomad. Excellent problem-solving, communication, and collaboration skills. Nice to have: Experience managing distributed systems, microservices, and event-driven architectures. Knowledge of observability tools such as Prometheus, Grafana, ELK Stack, or Datadog. Experience with security best practices, monitoring, and incident response. Familiarity with DevSecOps and compliance frameworks (ISO 27001, SOC 2, GDPR). More ❯
ArgoCD and Helm). Experience in migrating monolithic applications into microservices architectures. In-depth Linux/Unix experience, emphasizing system performance tuning and automation. Familiarity with monitoring, logging, and observability tools (e.g., Prometheus, Grafana, Loki, OTel, ELK stack) to ensure system reliability and performance. Experience in developing and working with backend applications technologies (e.g. Express, Django). Benefits we offer More ❯
Leeds, England, United Kingdom Hybrid / WFH Options
Sportserve
to GitLab-CI) Maintain high standards of platform reliability, security, capacity, and performance Support hiring, onboarding, and knowledge sharing Assist in incident management and operational excellence Promote cost awareness, observability, and performance optimization Enhance team skills through knowledge sharing and setting quality benchmarks Advocate for non-functional requirements like monitoring, alerting, logging Requirements 8+ years in senior or lead roles More ❯
Liverpool, Lancashire, United Kingdom Hybrid / WFH Options
The Acorn Group
with GitOps tools (e.g., ArgoCD, Flux). CI/CD - Skilled in building and managing pipelines using Azure DevOps, GitHub Actions, etc. Monitoring - Experience with Prometheus, Grafana, and other observability tools. Application Stack - Familiarity with .NET, Node.js, React, and web server technologies like Nginx. Relevant certifications or the ability to demonstrate equivalent experience, such as: Terraform Associate About Acorn Insurance More ❯
GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater … visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll do Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues Integrate monitoring platforms … with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows Act as a Monitoring & Observability SME within customer delivery teams Support incident response activities and postmortems by identifying patterns, root causes, and optimisation opportunities Work collaboratively with cross-functional teams to define and implement best practices in observability and monitoring Attend customer and More ❯
Monitoring & Observability Engineer Life on the team At Computacenter, you'll be joining a world-class team of over 1,000 skilled professionals within Group Professional Services (GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and … modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll … do Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues Integrate monitoring platforms with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows Act as a Monitoring More ❯
Python (or other language), Bash/Shell, YAML including any Development frameworks Extensive experience and in-depth knowledge of the Linux operating system for effective troubleshooting activities Experience with Observability tools like Grafana, Prometheus, ELK, OCI Observability We highly value ownership and initiative with capabilities to drive projects independently Dealing with changes on a daily basis in a very dynamic More ❯
to architect secure, performant, and highly available cloud solutions. Proficiency with monitoring and log analytics tools such as AWS CloudWatch, ELK Stack, Prometheus, Datadog, or New Relic, to maintain observability and ensure operational excellence. Demonstrated leadership skills in managing complex, high-pressure situations and guiding teams through incident resolution. Exceptional communication and presentation skills, with proven experience engaging with senior More ❯
is a plus) A strong interest in automation, infrastructure best practices, and continuous learning Good communication skills and a collaborative mindset Experience with Terraform, Ansible, or Helm Familiarity with observability tools such as ELK Stack, CloudWatch, or New Relic Understanding of security considerations in cloud and CI/CD environments More ❯
one or more public cloud providers such as Azure, AWS or GCP. Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation. Experience with monitoring, observability and logging tools such as Datadog, Prometheus, Grafana, or similar. Proven track record of maintaining highly-available and performant production environments. Ability to identify and implement effective mitigation strategies and More ❯
Ansible). • Proficiency in Linux/Unix systems, networking concepts, and security principles. • Familiarity with software development security tooling (e.g., SonarQube, Grype, JFrog Xray). • Knowledge of monitoring and observability tools like Prometheus, Grafana, and Elastic Stack. • Knowledge of the ASD Essential 8 and ISM and how to apply the relevant controls. • Experience with database administration (e.g., PostgreSQL, Microsoft SQL More ❯
to maintain a CI build environment capable of running automation tests for effective feedback. Assist in designing, developing and implementing automation test frameworks. Develop and improve our monitoring and observability tooling. Coach and mentor team mates to improve their own DevOps skills and experience. Research emerging tools, trends and methodologies. Assist in managing checked-in source code from check-in through to More ❯
London, England, United Kingdom Hybrid / WFH Options
Global Screening Services
impact! About The Role This is an exciting opportunity to join our growing Operations team managing Kubernetes clusters in Production and, through a DevOps culture, empower development teams with observability insights they can use to innovate faster. We are looking for a Site Reliability Engineer, or production experienced DevOps Engineer, who has working experience building observability for cloud native SaaS … products and driving operational excellence. You will be responsible for delivering our monitoring infrastructure, shaping observability, and responding to incidents as well as ensuring the platform is performant and reliable. You will be a key member of the team, liaising with product teams, embedding SRE principles and building the observability platform for the next stage of growth at GSS. You … new features are maintainable, have well defined SLIs, achievable SLOs, are properly monitored, and evaluated for failure scenarios Enabling development teams through DevOps culture and the effective use of observability tools. Promote best practice, present KT sessions, help troubleshoot and resolve business affecting issues Building on our existing monitoring tools to deliver a comprehensive, optimised observability platform for logging, metrics More ❯
Southampton, Hampshire, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment
level goals You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck Experience using configuration management … principles and hands-on experience with tools such as Jenkins, GitLab CI/CD, or CircleCI Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch Excellent analytical and troubleshooting abilities, especially within complex distributed systems Proven experience handling incident management and conducting blameless More ❯
Hampshire, England, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment
level goals You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck Experience using configuration management … principles and hands-on experience with tools such as Jenkins, GitLab CI/CD, or CircleCI Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch Excellent analytical and troubleshooting abilities, especially within complex distributed systems Proven experience handling incident management and conducting blameless More ❯
Eastbourne, England, United Kingdom Hybrid / WFH Options
AxisOps
and architecture through to production and operations. Our strength lies in software delivery, supported by deep expertise in platform engineering, built on an understanding of private cloud-native infrastructure, observability, and DevSecOps. Our culture We value sharp thinking, clear communication, and teams that look out for each other. At AxisOps, our core values are: Ingenuity – solving hard problems with elegant … runtimes is welcome but not required) Maintain and evolve microservice architecture built in Python and PHP, with deployment via GitLab CI/CD and runtime orchestration via Andromeda Deliver observability using Prometheus, Grafana, and the ELK stack, supporting metrics, logs, and alerting workflows Support and maintain internal ML infrastructure and pipelines , helping ensure that our AI and data workloads run … maintain standardised developer desktop environments , supporting our engineering team’s daily tooling and dev workflow Contribute to our IoT platform , including reliable edge infrastructure, secure messaging, and data flow observability Support and maintain our private datacentre , including rack-level hardware, networking, and server fleet resilience Continuously improve security posture , covering patching, firewall maintenance, secrets handling, and backup strategy Write markdown More ❯