FastAPI and SQLAlchemy for building robust and efficient backend solutions. Strong hands-on experience with Terraform for infrastructure as code, enabling scalable and reliable systems. Experience with monitoring and observability tools, such as Datadog or Prometheus. Familiarity with event-driven systems, particularly Kafka and/or RabbitMQ. Deep understanding of messaging and queuing systems, including design patterns for reliability, retries More ❯
Responsibilities: Architect and implement scalable, secure Kubernetes-based infrastructure for multi-cloud and hybrid environments. Lead technical direction for core Fleet initiatives: control plane services, tenancy models, deployment pipelines, observability layers, and more. Mentor engineers across the team, fostering a strong engineering culture of ownership, curiosity, and excellence. Drive modernization efforts, introducing patterns like GitOps, Policy-as-Code (Kyverno), Cilium More ❯
Infrastructure Observability Engineer - Leading Trading Company Location: London, UK Contract Type: Permanent Salary: Competitive + Benefits About Our Client Our client is a well-established trading company with a strong presence in the global commodities market. They are committed to leveraging cutting-edge technology solutions to drive operational excellence and maintain their competitive edge in the fast-paced trading environment. … The Role We are seeking an experienced Infrastructure Observability Engineer to lead the design, implementation, and continuous improvement of our client's enterprise observability platform. This role focuses on delivering comprehensive monitoring, event correlation, and impact analysis, demonstrating AIOps capabilities and tools such as BMC Helix Operations Manager. The ideal candidate will be passionate about improving access to infrastructure performance … automating operational intelligence, and reducing mean time to resolution (MTTR) through intelligent alerting and root cause analysis. Key Responsibilities Own and evolve the enterprise observability strategy across all infrastructure tracks Design, implement, and support event management and impact analysis workflows using platforms such as BMC Helix Operations Manager Integrate and correlate data from multiple sources (e.g., 20+ monitoring systems) into More ❯
Monitoring & Observability Engineer Life on the team At Computacenter, you'll be joining a world-class team of over 1,000 skilled professionals within Group Professional Services (GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and … modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll … do Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues Integrate monitoring platforms with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows Act as a Monitoring More ❯
Bristol, Avon, England, United Kingdom Hybrid / WFH Options
interAct Consulting Limited
as-Code (IaC). Experience of Configuration-as-Code, Containerisation and Orchestration, CI/CD. Proficiency with Kubernetes, Docker and AKS. Familiarity with Azure cloud-native services. Knowledge of observability and site-reliability engineering principles. Proficiency in SQL and experience working with relational databases. This is a fully remote (UK only) position within a fabulous team. Lots of flexibility, opportunity More ❯
Newcastle Upon Tyne, Tyne And Wear, United Kingdom
Strive Gaming
in between - ensuring our platform is resilient, efficient, secure and developer-friendly. Key Responsibilities: Design, build, and maintain platform services and infrastructure used by product engineering teams. Improve reliability, observability, and scalability of existing systems. Develop and maintain CI/CD pipelines to support software delivery. Build tooling and automation that supports self-service infrastructure and deployment. Ensure security best More ❯
move? Get in touch and apply today! Responsibilities: Respond rapidly to critical AWS incidents, identify root causes, and deploy automated hotfixes. Lead the setup and integration of Prometheus-Grafana observability stack. Refactor and modernize deployment pipelines using GitHub Actions and Kubernetes. Maintain robust monitoring, alerting, and CI/CD systems. Skills/Must have: Strong hands-on experience with AWS More ❯
years in platform/SRE/DevOps roles * Strong Kubernetes experience (config and deployment) * Deep CI/CD experience - Jenkins, GitLab CI/CD or similar * Skilled with infra observability tooling (Prometheus, Grafana, etc.) * Confident with Git and repo management workflows * Strong automation mindset - reducing manual intervention wherever possible * Cloud experience (AWS, Azure or GCP) * Must be a sole UK More ❯
Skills Java 17+ and Spring Boot microservices REST API development SQL and data modelling (ideally PostgreSQL) AWS and OpenShift CI/CD and modern DevOps practices Cloud monitoring and observability tools Infrastructure as Code Desirable Skills Front-end development with React, HTML/CSS, and accessibility awareness AWS services such as Lambda, S3, Aurora, API Gateway, CDK, CloudFormation Testing frameworks like More ❯
innovation within the team. Desirable Exposure to AWS AI services (e.g., Lex, Bedrock). Experience with serverless architectures and event-driven design patterns. Familiarity with containerization (Docker, ECS) and observability tooling. Team Fit A proactive mindset with a passion for mentoring and uplifting team performance. Strong communication skills and the ability to work collaboratively across distributed teams. A drive to More ❯
HIPAA). Background in customer-centric or product-driven industries such as digital, eCommerce, or SaaS. Experience with infrastructure-as-code tools like Terraform and expertise in data observability and monitoring practices. More ❯
innovation within the team. Desirable: Exposure to AWS AI services (e.g., Lex, Bedrock). Experience with serverless architectures and event-driven design patterns. Familiarity with containerisation (Docker, ECS) and observability tooling. Team Fit: A proactive mindset with a passion for mentoring and uplifting team performance. Strong communication skills and the ability to work collaboratively across distributed teams. A drive to More ❯
Liverpool, Merseyside, England, United Kingdom Hybrid / WFH Options
Broster Buchanan Ltd
scalability and resilience in applications handling large volumes of traffic and burst events. Work collaboratively with cross-functional teams, including DevOps, Infrastructure, and Product, to deliver robust systems. Leverage observability tools to monitor, alert, and troubleshoot application and integration health. Stay current on AI-driven software development practices (e.g., GPT-assisted development, Agentic AI workflows) and suggest practical implementations. Participate More ❯
architectures across Azure, AWS, and Google Cloud Leading platform engineering squads using DevSecOps, Kubernetes, and automation tooling Enabling edge and private cloud capabilities (e.g., Azure Stack, AWS Outposts) Implementing observability and governance tooling to support modern operations Supporting Agile and product-based delivery using SRE, CI/CD, and Infrastructure as Code Advising clients on architecture optimisation, security, cost control More ❯
automation, scalability, and high reliability. A strong working knowledge of Microsoft Azure is essential. The role involves daily coding, technical leadership across orchestration, CI/CD pipelines, cloud services, observability, and security, working alongside site reliability, onboarding, architecture, and delivery functions. You're expected to scale impact through others by upskilling team members, hiring where needed, and championing platform engineering More ❯
workflows. Implement robust monitoring, alerting, and incident response processes to maintain high levels of system reliability and uptime. Continuously assess and integrate new tools and technologies to enhance automation, observability, and scalability. Drive platform automation across provisioning, deployments, security controls, and operational workflows Proven experience in a DevOps or platform engineering role, ideally within a fast-paced or regulated environment. More ❯
Fi authentication systems, CRMs and partnered PropTech tools Continually hone and perfect our homegrown DevOps and CI/CD processes by further developing GitHub Actions pipelines, Terraform definitions and observability integrations. Ensure quality & reliability: establish testing best practices (unit, integration, end-to-end), conduct code reviews and demand high quality standards Shape and refine our cloud-native platform to optimise More ❯
Employment Type: Permanent
Salary: £80000 - £85000/annum Plus Bonus and Benefits
Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
day-to-day and strategic decision making. You will be a hands-on and customer-focused engineering servant-leader. You will be comfortable moving across orchestration, automation, pipelines, cloud services, observability and security domains (even if you are not an expert in them all). A non-negotiable is experience and familiarity with Microsoft Azure. You will play your part in operating More ❯
chains, for both internal and external use. This involves work across various disciplines, with a tight focus on our specific areas of responsibility, such as cloud provisioning, infrastructure management, observability, and CI/CD. You will be responsible for building and maintaining various tools, solutions and services associated with these areas. Taking ownership where needed. We've no shortage of More ❯