our enterprise messaging infrastructure, ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, network optimization, and system observability using industry-standard monitoring tools. Required Skills & Qualifications: 3+ years of experience administering enterprise-grade messaging systems. Strong background in production support, preferably in a 24x7 enterprise environment. Experience working More ❯
Wokingham, Berkshire, United Kingdom Hybrid / WFH Options
Nordcloud
to L3 networking Programming languages, such as C#, Python, Perl, Java, C++ CI/CD tools such as Azure DevOps, GitHub Actions, GitLab, Jenkins, TeamCity Scripting languages such as PowerShell, bash Observability/Monitoring: Prometheus, Grafana, Splunk Must have experience with either Kubernetes or OpenShift Hosting technologies such as IIS, nginx, Apache, App Service, Lightsail Analytical and creative approach to problem solving More ❯
Reading, Berkshire, South East, United Kingdom Hybrid / WFH Options
Halian Technology Limited
in the team Contribute to solution architecture and strategic technical direction Build, integrate, and maintain REST APIs and backend services Champion best practices in software quality, CI/CD, observability, and DevOps Collaborate with cross-functional teams including Product, QA, and DevOps Optionally take on people management responsibilities for engineers Stay updated with emerging backend and cloud technologies Key Skills More ❯
South East London, London, United Kingdom Hybrid / WFH Options
TEN10 SOLUTIONS LIMITED
stakeholder management skills. Nice-to-Have: Hands-on experience with Databricks, Apache Spark, and Azure Deequ. Familiarity with Big Data tools and distributed data processing. Experience with data observability and data quality monitoring. Proficiency with CI/CD tools like Jenkins, Azure DevOps, or GitLab CI. Previous consultancy or client-facing experience. Additional languages like SQL, TypeScript, or Bash More ❯
S3, Aurora, and ElastiCache Proficient with CI/CD tools such as GitLab, GitHub Actions, or CircleCI Strong testing capabilities using JUnit, RestAssured, or similar frameworks Proactive with monitoring, observability, and system health Desirable Skills: Exposure to monitoring platforms like Datadog, Grafana, Prometheus, or PagerDuty Familiarity with Python scripting Experience with Kubernetes and deployment tools such as Helm Why Join More ❯
Milton Keynes, Buckinghamshire, England, United Kingdom
Noir
financial institution with soaring profits - my client is modernising platforms, embracing AI, and driving automation at scale. We're hiring a Lead Site Reliability Engineer (SRE) to drive reliability, observability, and performance across our Azure cloud infrastructure. You'll work in a modern engineering environment where we live by "you build it, you run it", focused on automation, scale, and More ❯
Bracknell, Berkshire, South East, United Kingdom Hybrid / WFH Options
Halian Technology Limited
in the team Contribute to solution architecture and strategic technical direction Build, integrate, and maintain REST APIs and backend services Champion best practices in software quality, CI/CD, observability, and DevOps Collaborate with cross-functional teams including Product, QA, and DevOps Optionally take on people management responsibilities for engineers Stay updated with emerging backend and cloud technologies Key Skills More ❯
Maidenhead, Berkshire, United Kingdom Hybrid / WFH Options
dynaTrace software GmbH
Docker, Kubernetes etc. Ideal candidates will have 2+ years of Dynatrace Technology experience Dynatrace Product Certification. Why you will love being a Dynatracer Dynatrace is a leader in unified observability and security. We provide a culture of excellence with competitive compensation packages designed to recognize and reward performance. Our employees work with the largest cloud providers, including AWS, Microsoft, and More ❯
Infrastructure Observability Engineer - Leading Trading Company Location: London, UK Contract Type: Permanent Salary: Competitive + Benefits About Our Client Our client is a well-established trading company with a strong presence in the global commodities market. They are committed to leveraging cutting-edge technology solutions to drive operational excellence and maintain their competitive edge in the fast-paced trading environment. … The Role We are seeking an experienced Infrastructure Observability Engineer to lead the design, implementation, and continuous improvement of our client's enterprise observability platform. This role focuses on delivering comprehensive monitoring, event correlation, and impact analysis, demonstrating AIOps capabilities and tools such as BMC Helix Operations Manager. The ideal candidate will be passionate about improving access to infrastructure performance … automating operational intelligence, and reducing mean time to resolution (MTTR) through intelligent alerting and root cause analysis. Key Responsibilities Own and evolve the enterprise observability strategy across all infrastructure tracks Design, implement, and support event management and impact analysis workflows using platforms such as BMC Helix Operations Manager Integrate and correlate data from multiple sources (e.g., 20+ monitoring systems) into More ❯
years in platform/SRE/DevOps roles * Strong Kubernetes experience (config and deployment) * Deep CI/CD experience - Jenkins, GitLab CI/CD or similar * Skilled with infra observability tooling (Prometheus, Grafana, etc.) * Confident with Git and repo management workflows * Strong automation mindset - reducing manual intervention wherever possible * Cloud experience (AWS, Azure or GCP) * Must be a sole UK More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Become
collaboration skills across multidisciplinary teams Desirable Attributes Exposure to microservices architecture and event-driven systems (e.g., Kafka) Experience with design systems and component libraries (e.g., Material, Storybook) Familiarity with observability tools and performance tuning Prior consulting experience or experience in client-facing roles Engagement Model Outside IR35 12-month initial contract with potential for extension or permanent employment Hybrid working More ❯
Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
day-to-day and strategic decision making. You will be a hands-on and customer-focused engineering servant-leader. You will be comfortable moving across orchestration, automation, pipelines, cloud services, observability and security domains (even if you are not an expert in them all). A non-negotiable is experience and familiarity with Microsoft Azure. You will play your part in operating More ❯
GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater … visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll do Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues Integrate monitoring platforms … with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows Act as a Monitoring & Observability SME within customer delivery teams Support incident response activities and postmortems by identifying patterns, root causes, and optimisation opportunities Work collaboratively with cross-functional teams to define and implement best practices in observability and monitoring Attend customer and More ❯
secure, scalable cloud data solutions, aligning with business and compliance needs. Key Responsibilities Design, build, and maintain cloud-native data pipelines (Azure/GCP) Implement scalable data management frameworks: observability, validation, lineage Translate business needs into effective technical prototypes and solutions Collaborate with stakeholders, data teams, and service partners Ensure data security, governance, and regulatory compliance Monitor and optimise cloud More ❯
Reigate, Surrey, England, United Kingdom Hybrid / WFH Options
Client Server Ltd
of IaC principles and tools such as Terraform and Pulumi You have experience of building and improving CI/CD pipelines for product teams You have experience with cloud observability (logging, tracing, metrics, monitoring and alerting) You have experience with Containerisation - Azure Container Apps preferred You have strong scripting skills with PowerShell and/or C# .Net coding You enjoy More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Method Resourcing
teams to operationalize models and ship ML-powered features into production. Continuously assess and iterate on production models, balancing long-term ML strategy with tactical improvements. Champion code quality, observability, and resilience within their ML systems through reviews and hands-on contributions. Help shape their internal ML standards and practices, ensuring they stay ahead of industry advancements. Offer technical mentorship More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Pontoon
support) What You Bring: Strong Java (streams, lambdas, concurrency) and front-end skills with React.js Deep knowledge of multithreaded, distributed systems and asynchronous architecture Experience with JVM tuning and observability tools (Prometheus, Elastic, etc.) TDD, CI/CD, and agile delivery experience Ability to deliver from design to deployment Bonus Points: Experience in Front Office, Risk, or Pricing within investment More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
WüNDER TALENT
with third-party APIs to support real-time marketing insights. Collaborate closely with cross-functional teams including Data Science, Software Engineering and Product. Champion best practices in data governance, observability and compliance. Contribute to CI/CD pipeline development and infrastructure automation (Terraform, AWS DevOps). Provide input into technical decisions, peer reviews and solution design. Requirements Proven experience as More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
and resolve application-level production incidents The Person: * 5+ years in SRE, DevOps, or infrastructure engineering * Strong experience with AWS, EKS/Kubernetes, and Terraform * Familiar with Kafka and observability tools like Datadog or Grafana * Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH259300 To apply for this role or to be considered for further roles More ❯
Guildford, Surrey, United Kingdom Hybrid / WFH Options
Electronic Arts
Source control management tools (e.g. Perforce, Git) Configuration management tools (e.g. Chef, Ansible, Terraform, Packer) Secrets management tools (e.g. Vault) Virtualization environments and tools (e.g. VMs, vSphere) Data and Observability tools (e.g. Splunk, Grafana, New Relic, OpenTelemetry) Growth-oriented mindset About Electronic Arts We're proud to have an extensive portfolio of games and experiences, locations around the world More ❯
Milton Keynes, Buckinghamshire, England, United Kingdom
IT Talent Solutions Ltd
Architect Expert, DevOps Engineer Expert). Experience in enterprise-scale environments or regulated industries. Exposure to hybrid cloud models, legacy system integration, and cloud migrations. Familiarity with monitoring and observability tools such as Azure Monitor, Application Insights, or Log Analytics. More ❯
Employment Type: Full-Time
Salary: £50,000 - £70,000 per annum, Negotiable, Inc benefits
London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
reporting data is delivered on time and without failure. The ideal candidate will have strong experience working with streaming and batch data systems, a solid understanding of monitoring and observability, and hands-on experience working with AWS, Apache Flink, Kafka, and Python. This is a fantastic opportunity to step into an SRE role focused on data reliability in a modern cloud More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
data platform. Cleanse, transform, and model data to support BI tools (e.g., Power BI) and AI/ML use cases. Implement data validation, lineage tracking, and metadata tagging for observability and trust. Collaborate with stakeholders to standardise KPIs and support cross-functional reporting. Ensure secure handling of sensitive data in line with GDPR and internal policies. What We're Looking More ❯
concerns and driving service excellence. Communicate effectively with internal and external stakeholders, providing insights and updates on service health and operational performance. Continuous Improvement Lead initiatives to increase automation, observability, and operational resilience. Stay abreast of industry trends, emerging technologies, and best practices, fostering a culture of continuous learning within the team. Requirements Proven experience in IT Service Operations, ideally More ❯