GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater … visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll do Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues Integrate monitoring platforms … with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows Act as a Monitoring & Observability SME within customer delivery teams Support incident response activities and postmortems by identifying patterns, root causes, and optimisation opportunities Work collaboratively with cross-functional teams to define and implement best practices in observability and monitoring Attend customer and More ❯
Manual Tester (DV Security Clearance) Position Description Are you an experienced Test Analyst with a background in secure or classified programmes, ready to contribute to projects of national importance? Step into a role where you'll challenge the complex to More ❯
will involve designing robust software solutions that enhance system performance while ensuring high availability for critical applications. You will work hand-in-hand with product engineering teams to improve observability tools and telemetry systems, driving forward automation initiatives that reduce manual intervention. By participating in incident management processes, facilitating transparent communication with stakeholders, and leading blameless post-mortems, you will … a focus on automating these activities wherever possible.* Provide on-call support during production incidents outside standard working hours as required by business needs.* Contribute to enhancing product observability and telemetry by supporting ongoing modernisation efforts within the infrastructure.* Collaborate closely with engineering teams to brainstorm ideas that simplify infrastructure management and streamline SRE practices. What you bring: * Proficiency More ❯
Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
for someone who has: Strong .NET framework knowledge (C#, ASP.NET Core, etc.) Expertise in Windows Server administration Database administration (SQL Server primarily) Ability to instrument and consume monitoring and observability tools (Application Insights, Prometheus, Grafana) Experience using PowerShell, Azure CLI, and Bash for automation tasks Previous experience with Azure DevOps, Jenkins, GitHub Actions, or similar tools Containerisation and orchestration (Docker More ❯
our enterprise messaging infrastructure, ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, network optimization, and system observability using industry-standard monitoring tools. Required Skills & Qualifications: 3+ years of experience administering enterprise-grade messaging systems. Strong background in production support, preferably in a 24x7 enterprise environment. Experience working More ❯
incidents using data-driven decision making to minimise downtime and financial impact while leading root cause analysis and conducting blameless post-mortems.* Enhance application health monitoring by implementing robust observability solutions and automating manual processes to improve system resilience.* Drive cost optimisation initiatives and manage capacity resources to ensure efficient and scalable operations across all FX trading platforms.* Collaborate with … Deep technical expertise in Linux/Unix systems administration combined with strong SQL skills and proficiency in scripting languages such as Python or Java.* Demonstrated experience with monitoring and observability tools including Prometheus, Grafana, Splunk, Geneos, OpenTelemetry or Corvil is highly desirable.* Familiarity with cloud platforms as well as containerisation technologies like Kubernetes or Docker alongside CI/CD pipeline More ❯
our production systems. Key Responsibilities Design, implement, and manage AWS cloud infrastructure. Develop and maintain automation scripts and tooling. Support production systems and ensure high availability and performance. Implement observability and monitoring solutions. Collaborate closely with the PBS (Platform/Backend Services) team. Contribute to infrastructure as code (IaC) and DevOps best practices. Requirements Hands-on experience with AWS. Automation … experience (e.g., Terraform, Ansible, CI/CD tools). Strong understanding of infrastructure and cloud architecture. Experience supporting production environments. Familiarity with observability tools (e.g., Prometheus, Grafana, CloudWatch). Excellent problem-solving and communication skills. Desirable Experience working in a fast-paced or agile development environment. Familiarity with container technologies (e.g., Docker, Kubernetes). Previous experience in a similar role More ❯
a key member of the Dynatrace sales engine and will be responsible for providing excellent technical support to the sales team. You will be the expert on Dynatrace and observability, with a specialization in Log Management and Analytics. Within this exciting role, you will be responsible for executing great demos which demonstrate Dynatrace's unique approach in solving the customer … be filled at a higher level based on candidate experience. What will help you succeed Preferred Requirements: Experience with query languages such as SQL, SPL, or KQL. Experience with observability and log collectors/pipelines such as FluentBit, OpenTelemetry, Cribl, and Logstash. Experience with web technologies such as HTML, CSS, and JavaScript. Experience with programming/scripting side technologies such … OpenShift, Serverless functions, and CI/CD pipelines. Experience with automation tools like Ansible, Puppet, Terraform, etc. Why you will love being a Dynatracer Dynatrace is a leader in unified observability and security. We provide a culture of excellence with competitive compensation packages designed to recognize and reward performance. Our employees work with the largest cloud providers, including AWS, Microsoft, and More ❯
Bracknell, Berkshire, South East, United Kingdom Hybrid / WFH Options
Halian Technology Limited
in the team Contribute to solution architecture and strategic technical direction Build, integrate, and maintain REST APIs and backend services Champion best practices in software quality, CI/CD, observability, and DevOps Collaborate with cross-functional teams including Product, QA, and DevOps Optionally take on people management responsibilities for engineers Stay updated with emerging backend and cloud technologies Key Skills More ❯
South East London, London, United Kingdom Hybrid / WFH Options
TEN10 SOLUTIONS LIMITED
stakeholder management skills. Nice-to-Have: Hands-on experience with Databricks, Apache Spark, and Azure Deequ. Familiarity with Big Data tools and distributed data processing. Experience with data observability and data quality monitoring. Proficiency with CI/CD tools like Jenkins, Azure DevOps, or GitLab CI. Previous consultancy or client-facing experience. Additional languages like SQL, TypeScript, or Bash More ❯
Position Summary We are looking for an experienced Systems Engineer with strong Linux and Kubernetes experience to join our Group Engineering - Systems team. You will help design, build and operate modern infrastructure platforms that support continually evolving applications and services. More ❯
Portsmouth, Hampshire, United Kingdom Hybrid / WFH Options
Checkatrade
Hybrid working. Where do you fit in? We're seeking a Senior Platform Engineer with a strong background in cloud-native technologies and a passion for automation, DevOps, and observability practices. You'll be at the forefront of building and maintaining our infrastructure using tools like Kubernetes, Terraform, Helm, and Datadog. You will drive the adoption of infrastructure-as-code … AWS is also valuable, with a willingness to work within a GCP environment. Experience with programming languages such as Golang, Python, and JavaScript. Passion for automation, DevOps, SRE, and observability practices. Proven leadership, management skills, and excellent communication abilities. We are an equal opportunities employer committed to diversity and inclusion in the workplace. About us We're Checkatrade, the UK More ❯
Milton Keynes, Buckinghamshire, England, United Kingdom
Noir
financial institution with soaring profits - my client is modernising platforms, embracing AI, and driving automation at scale. We're hiring a Lead Site Reliability Engineer (SRE) to drive reliability, observability, and performance across our Azure cloud infrastructure. You'll work in a modern engineering environment where we live by "you build it, you run it", focused on automation, scale, and More ❯
years in platform/SRE/DevOps roles * Strong Kubernetes experience (config and deployment) * Deep CI/CD experience - Jenkins, GitLab CI/CD or similar * Skilled with infra observability tooling (Prometheus, Grafana, etc.) * Confident with Git and repo management workflows * Strong automation mindset - reducing manual intervention wherever possible * Cloud experience (AWS, Azure or GCP) * Must be a sole UK More ❯
Reigate, Surrey, South East, United Kingdom Hybrid / WFH Options
Client Server
of IaC principles and tools such as Terraform and Pulumi You have experience of building and improving CI/CD pipelines for product teams You have experience with cloud observability (logging, tracing, metrics, monitoring and alerting) You have experience with containerisation - Azure Container Apps preferred You have strong scripting skills with PowerShell and/or C# .NET coding You enjoy More ❯
Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
day-to-day and strategic decision making. You will be a hands-on and customer-focused engineering servant-leader. You will be comfortable moving across orchestration, automation, pipelines, cloud services, observability and security domains (even if you are not an expert in them all). A non-negotiable is experience and familiarity with Microsoft Azure. You will play your part in operating More ❯
Pipelines is a plus. Experience with multi-cloud and hybrid cloud environments. Experience with Elastic (or OpenSearch) and Grafana. Knowledge of ServiceNow for change management and incident management. Familiarity with observability tools and practices for 24x7x365 monitoring and alerting. Identity and Access Management experience is a plus for this role. About Bentley Systems Bentley Systems More ❯
and CI/CD workflows (GitLab CI). Write clean, production-grade code in Python (Scala is a bonus). Build infrastructure using Terraform, AWS CloudFormation, or SAM. Drive observability across the platform using Datadog or CloudWatch. Actively mentor Data Engineers and Associates, and lead technical discussions and design sessions. Key requirements: Must-Have: Strong experience with AWS services: Glue More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
Assist in migrating applications and services into a hybrid cloud environment (AWS)* Use Infrastructure as Code tools (Terraform, Ansible) to manage environments* Help automate deployment pipelines and ensure system observability* Act as escalation point for infrastructure-related support tickets* Collaborate with engineering and customer success teams to support streaming platform uptime Ideal Candidate Strong Linux systems engineering and troubleshooting experience … centre access* Driving licence required due to physical server access needs* Proactive, flexible and calm under pressure Tech Stack/Tools Linux* Terraform* Ansible* AWS (Hybrid migration underway)* Monitoring & observability tooling Benefits £75,000 base salary* Time off in lieu for any out-of-hours work* 25 days holiday* Fully remote working (must be within travel distance to London)* Dynamic More ❯
Engineering Hybrid Remote, London, United Kingdom; Reading, United Kingdom Splunk, a Cisco company, provides the Unified Security and Observability Platform. The world's leading organisations trust Splunk to go from insight to action fast and at scale; organisations such as McLaren, Heineken, and Tesco are turning data into action with Splunk. Join us as we pursue our innovative vision to make machine … IT architecture concepts such as High Availability, Disaster Recovery Highly Desirable Knowledge and Experience; I have some or all of these too: Domain knowledge in any of: security operations, Observability, DevOps, IT operations, big data or log management. Experience writing and using regular expressions. Experience coding in Python. Experience working with REST APIs. Experience with container and container orchestration technology. More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Morgan Hunt Recruitment
Postgres) Implement OCR, NLP, and ML for document analysis and automation risk assessment Lead R&D spikes and validate system improvements through robust data analysis Ensure code quality, testing, observability, and non-functional compliance (security, UX, performance) Coach team members and contribute to Agile delivery practices Essential Skills Strong commercial experience with Python, TypeScript, SpaCy, and AWS (serverless) Background in More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
and Analytics to ensure data accessibility and reliability* Optimise data infrastructure and automate manual processes* Uphold data integrity with strong validation and testing practices* Drive improvements in data delivery, observability, and internal tooling* Support cross-functional teams in leveraging data for decision-making Ideal Candidate Strong proficiency in Python and SQL for data wrangling and automation* Experience with ML workflows More ❯
to build cost-effective solutions on Microsoft Azure while maintaining agility and fostering innovation. This position is perfect for engineers who are passionate about optimising cloud usage, enhancing cost observability, and championing a FinOps culture. Experience in some of the following would be ideal: Partner with engineering, finance and product teams to drive cost-efficiency across Azure Clear understanding More ❯