Observability Jobs in England

Databricks Data Engineer Contract

London, South East, England, United Kingdom
Hybrid/Remote Options
Harnham - Data & Analytics Recruitment
developing and scaling the company's core data platform, ensuring teams across the business can access, trust, and use data effectively. You'll drive initiatives that improve data quality, observability, and governance, while helping shape a platform-as-a-product mindset. Key responsibilities include: Building and maintaining data infrastructure: Develop microservices, pipelines, and backend systems that power analytics and machine … learning initiatives. Driving platform evolution: Design and implement scalable, secure, and efficient data services using tools such as Terraform, Docker, and AWS. Data governance and observability: Introduce and enhance tooling for data lineage, contracts, monitoring, and cataloguing. Operational excellence: Lead automation, monitoring, and incident response to maintain high platform reliability. Cross-functional collaboration: Work with data scientists, ML engineers, analysts … Proven track record of designing, building, and scaling data platforms in production environments. Hands-on experience with big data technologies such as Airflow, DBT, Databricks, and data catalogue/observability tools (e.g. Monte Carlo, Atlan, Datahub). Knowledge of cloud infrastructure (AWS or GCP) - including services such as S3, RDS, EMR, ECS, IAM. Experience with DevOps tooling, particularly Terraform and More ❯
Employment Type: Contractor
Rate: £550 - £600 per day

Site Reliability Engineer

Hereford, Herefordshire, West Midlands, United Kingdom
Hybrid/Remote Options
Hays
focused on ensuring service availability, performance, and cost-efficiency across both cloud and on-prem infrastructure. You'll work closely with development and support teams to evolve infrastructure, enhance observability, and proactively mitigate reliability risks. Key Responsibilities: Collaborate with software engineers to improve reliability and performance Automate operational tasks and reduce alert fatigue Enhance monitoring and observability to pre-empt … platforms, ideally AWS (EC2, RDS, S3, Lambda) Desirable: Coding experience in Java, Go, Python or similar Knowledge of cross-domain technologies Experience in service management environments Practical application of observability patterns Experience with Azure Additional Information: Due to the nature of the work, successful candidates will be required to undergo security vetting. We welcome applications from all backgrounds and are More ❯
Employment Type: Contract, Work From Home
Rate: £500 - £600 per day

Site Reliability Engineer

South West London, London, United Kingdom
REVYBE IT RECRUITMENT LIMITED
make technical decisions and help shape how platform engineering is done as the team continues to scale. Tech stack AWS (Core services - EC2, RDS, S3, IAM, etc.) Monitoring and Observability Grafana, Prometheus Kubernetes (building and managing production clusters) Terraform (IaC provisioning) Python, Bash or Go (scripting, automation) GitHub Actions (CI/CD pipelines) What They're Looking For Experience in AWS … cloud infrastructure (ideally in a regulated or high-traffic environment) Previous experience working with Monitoring and Observability Tools Hands-on Kubernetes know-how, specifically with EKS. Solid IaC experience with Terraform. Experience with containerisation (Docker, Helm) and CI/CD (GitHub Actions or similar) Solid scripting/Automation experience with Python, Bash or Go A good communicator who enjoys working More ❯
Employment Type: Permanent
Salary: £75,000

Platform Data Engineer

London, South East, England, United Kingdom
Harnham - Data & Analytics Recruitment
and non-technical partners to deliver resilient infrastructure, champion data governance, and mentor others in engineering excellence. In this role, you will: Shape the data platform roadmap: Introduce modern observability, quality, and governance frameworks that elevate how teams access and trust data. Build and scale infrastructure: Develop services, APIs, and data pipelines using modern cloud tooling and automation-first principles. … scalable data platforms in production, enabling advanced users such as ML and analytics engineers. Hands-on experience with modern data stack tools - Airflow, DBT, Databricks, and data catalogue/observability solutions like Monte Carlo, Atlan, or DataHub. Solid understanding of cloud environments (AWS or GCP), including IAM, S3, ECS, RDS, or equivalent services. Experience implementing Infrastructure as Code (Terraform) and … e.g., Jenkins, GitHub Actions). A mindset focused on continuous improvement, learning, and staying at the forefront of emerging technologies. Nice to Have Experience rolling out data governance and observability frameworks, including lineage tracking, SLAs, and data quality monitoring. Familiarity with modern data lake table formats such as Delta Lake, Iceberg, or Hudi. Background in stream processing (Kafka, Flink, or More ❯
Employment Type: Contractor
Rate: £500 - £600 per day

Senior Azure DevOps Engineer

Nottingham, England, United Kingdom
Digital Waffle
and tuning system performance across multiple services and environments. Supporting development teams with deployment pipelines, CI/CD processes, and platform tools. Troubleshooting complex application and infrastructure challenges. Championing observability, incident response, and continuous improvement within SRE practices. What We’re Looking For Strong experience with Microsoft Azure and cloud-native technologies. Deep knowledge of Terraform, Kubernetes, and App Services. … pipelines with Azure DevOps. Experience in a Site Reliability, DevOps, or Platform Engineering role. Solid scripting or programming ability (PowerShell, Bash, Python, or similar). Familiarity with monitoring and observability tools such as Datadog, Azure Application Insights, or Log Analytics. Excellent collaboration and communication skills with the ability to work cross-functionally. A proactive mindset with a genuine passion for More ❯

Cloud Architect

Oxford, England, United Kingdom
Experis UK
network segmentation). Lead migration and modernisation (re‐host/re‐platform/re‐factor) for priority applications. Implement IaC at scale (Terraform preferred; standard modules; pipelines). Build observability (logs, metrics, traces, SLOs) and resilience (HA, DR, RTO/RPO). Drive FinOps —cost transparency, budgets, showback/chargeback, right‐sizing. Embed security‐by‐design and compliance (CIS, NIST … warehouses (BigQuery, Synapse, Redshift), ETL/ELT. API strategy (APIM/API Gateway/Apigee), messaging (SQS/SNS/Service Bus/PubSub), event‐driven design. Operations & Reliability Observability stack (CloudWatch/CloudTrail, Azure Monitor/Log Analytics, Cloud Logging/Monitoring; Prometheus/Grafana). DR/BCP architectures (cross‐region, multi‐region, backups, runbooks; tested failover). … KMS. Data/Integration: Event Hubs/Kafka/PubSub, API Gateway/APIM/Apigee, Data Factory/Glue/Cloud Data Fusion, BigQuery/Synapse/Redshift. Observability: Prometheus/Grafana, OpenTelemetry, CloudWatch, Azure Monitor, Cloud Monitoring, ELK/Elastic. Scripting: Python/Bash/PowerShell; strong Git and code review practices. Certifications (Nice to Have) Azure: AZ More ❯

Senior Platform Engineer

City Of London, England, United Kingdom
develop
Apply strong networking knowledge to optimise performance, security, and reliability. Ensure compliance with financial services regulations and internal security policies. Contribute to CI/CD pipelines and cloud-native observability solutions. Explore and integrate emerging technologies including AI/LLM-based solutions to enhance automation and operational efficiency. Key Skills & Experience Essential: Strong hands-on experience with AWS (EC2, VPC … or working with AI/LLM solutions Familiarity with Terraform, Ansible, GitLab CI/CD, or similar tools Exposure to financial services or other highly regulated industries Experience with observability stacks (Prometheus, Grafana, ELK, etc. More ❯

DevOps Engineer

Nottingham, England, United Kingdom
GTS Group Ltd
based microservices. Troubleshoot production issues, ensuring uptime and documenting processes on the internal wiki. Automate deployments, testing processes, and infrastructure provisioning (Terraform, Ansible, GitHub Actions). Implement monitoring and observability solutions for proactive issue detection. Provide occasional support for internal IT infrastructure (e.g., laptops, printers, office networking). Occasionally maintain and support CMS platforms (Magento, Joomla, WordPress). Experience Required … management) Docker containerization Python scripting for automation Git version control Desirable (Future-Facing Skills): Infrastructure as Code (Terraform, Pulumi, Ansible) Container orchestration (Kubernetes) Go development for microservice utilities Modern observability tools (Prometheus, Grafana, Datadog) CI/CD pipeline management (GitHub Actions, GitLab CI, Jenkins) Firewall-as-a-Service solutions (e.g., Cloudflare) Endpoint/device management (e.g., Intune, NinjaOne) Exposure to More ❯

Performance Tester

City of London, London, United Kingdom
Bestman Solutions
performance, scalability, failover, DR, resilience, alerting, and monitoring. Design and execute load, stress, endurance, and failover tests using industry-standard tools such as JMeter, LoadRunner, or ADS. Set up observability dashboards (Grafana, Splunk, Dynatrace, Kibana, or Datadog) to monitor test execution and system performance. Analyse results to identify performance bottlenecks, system vulnerabilities, and areas for optimisation. Report findings and recommendations … business teams. Experience working in Agile delivery environments with cross-functional teams. Nice to Have Background in financial services or experience supporting legacy-to-modernisation migrations. Understanding of infrastructure observability, cloud platforms, and microservice orchestration. Exposure to automation frameworks and scripting for performance testing. This is a key role within a global transformation programme — offering the chance to shape how More ❯

Performance Tester

London Area, United Kingdom
Bestman Solutions
performance, scalability, failover, DR, resilience, alerting, and monitoring. Design and execute load, stress, endurance, and failover tests using industry-standard tools such as JMeter, LoadRunner, or ADS. Set up observability dashboards (Grafana, Splunk, Dynatrace, Kibana, or Datadog) to monitor test execution and system performance. Analyse results to identify performance bottlenecks, system vulnerabilities, and areas for optimisation. Report findings and recommendations … business teams. Experience working in Agile delivery environments with cross-functional teams. Nice to Have Background in financial services or experience supporting legacy-to-modernisation migrations. Understanding of infrastructure observability, cloud platforms, and microservice orchestration. Exposure to automation frameworks and scripting for performance testing. This is a key role within a global transformation programme — offering the chance to shape how More ❯

Head of Infrastructure

London Area, United Kingdom
Hybrid/Remote Options
Harnham
the following: Technical tasks Architecting and scaling cloud infrastructure (GCP preferred) and high-performance computing environments Leading the design and implementation of DevOps platforms, CI/CD pipelines, and observability tools (Terraform, Docker, Kubernetes, Jenkins) Partnering with engineering and R&D to define technical roadmaps for compute and infrastructure products Other key responsibilities Managing and mentoring a team, fostering a … GitHub Actions; Terraform or CloudFormation; Prometheus, Grafana, Datadog, or New Relic; Slurm, Torque, LSF; MPI; Hadoop or Spark; Experience with high-performance computing, distributed systems, and observability tools Strong communication and executive presence, with the ability to translate complex technical concepts for diverse audiences Familiarity with AI/ML operations is a plus BENEFITS The successful Director More ❯

Head of Infrastructure

City of London, London, United Kingdom
Hybrid/Remote Options
Harnham
the following: Technical tasks Architecting and scaling cloud infrastructure (GCP preferred) and high-performance computing environments Leading the design and implementation of DevOps platforms, CI/CD pipelines, and observability tools (Terraform, Docker, Kubernetes, Jenkins) Partnering with engineering and R&D to define technical roadmaps for compute and infrastructure products Other key responsibilities Managing and mentoring a team, fostering a … GitHub Actions; Terraform or CloudFormation; Prometheus, Grafana, Datadog, or New Relic; Slurm, Torque, LSF; MPI; Hadoop or Spark; Experience with high-performance computing, distributed systems, and observability tools Strong communication and executive presence, with the ability to translate complex technical concepts for diverse audiences Familiarity with AI/ML operations is a plus BENEFITS The successful Director More ❯

Senior Rust Software Engineer

England, United Kingdom
Moody's Investors Service
experience designing and working with relational database schemas Excellent problem solving and communication skills, with a collaborative mindset Proficient in incremental software delivery leveraging agile processes Experience with software observability practices (distributed tracing, OpenTelemetry, etc.) Basic understanding of artificial intelligence concepts, with curiosity and enthusiasm for learning how AI tools can be used to improve processes and drive efficiency. Interest … systems Collaborate with cross functional teams including Product, QA, and DevOps Mentor junior engineers and promote engineering best practices Ensure code quality, security, and performance across all deliverables Champion observability and ensure software is observable, maintainable and resilient About the team Our Corp & Gov Technology team is responsible for delivering innovative software solutions that support Moody's public and private More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Cloud Infrastructure Engineer

Bristol, Avon, South West, United Kingdom
Hybrid/Remote Options
Hargreaves Lansdown
HL version control set) with quality gates, automated testing, security scanning, and progressive delivery. Introduce and run GitOps for Kubernetes (AKS preferred), patterns and multi-environment promotions. Own platform observability: metrics, logs and traces using Azure Monitor/Log Analytics/Application Insights, plus Datadog/Grafana where appropriate. Embed security by design: Azure Policy, Defender for Cloud, secrets management … cluster operations, node pools, networking (CNI), ingress, secrets, RBAC and workload identity. Experience with GitOps, and container build pipelines (e.g., ACR, OPA policies, image scanning). Working knowledge of observability tooling (Azure Monitor, Log Analytics, Application Insights, Datadog/Grafana) and alerting/response workflows. Understanding of the Microsoft Cloud Adoption Framework, Azure Landing Zones and the Well-Architected Framework. More ❯
Employment Type: Permanent, Part Time, Work From Home

Senior Data Engineer

England, United Kingdom
Hybrid/Remote Options
Harvey Nash
looking for an experienced Data Engineer to support on an initial 6 Month Contract engagement. You will own their data platform end to end, from ingestion & modelling to orchestration, observability & governance. You'll be responsible for designing & building robust, reliable pipelines, evolving their lakehouse/warehouse layers & enable fast, trustworthy analytics for multiple teams. Tech you'll be working with More ❯

Linux Production Engineer

London Area, United Kingdom
Autonomai Recruitment
distributed systems Contribute to ongoing improvements in reliability, latency, and scalability Qualifications: Linux expertise with a solid understanding of networking and containerisation Proficiency in at least Python Experience with observability tooling Proven track record in designing and maintaining highly distributed systems Apply now for a confidential chat More ❯

Linux Production Engineer

City of London, London, United Kingdom
Autonomai Recruitment
distributed systems Contribute to ongoing improvements in reliability, latency, and scalability Qualifications: Linux expertise with a solid understanding of networking and containerisation Proficiency in at least Python Experience with observability tooling Proven track record in designing and maintaining highly distributed systems Apply now for a confidential chat More ❯

Cloud Engineer - Azure

London Area, United Kingdom
Vallum Associates
Azure Container Instances), and ACA (Azure Container Apps). Create and maintain comprehensive documentation on newly implemented features suitable for an enterprise environment. Design and implement robust monitoring and observability tools to track container performance and health. Automate testing processes by utilizing public cloud elasticity and ephemeral resources, ensuring streamlined operations and reduced manual efforts. Contribute to the software development More ❯

Cloud Engineer - Azure

City of London, London, United Kingdom
Vallum Associates
Azure Container Instances), and ACA (Azure Container Apps). Create and maintain comprehensive documentation on newly implemented features suitable for an enterprise environment. Design and implement robust monitoring and observability tools to track container performance and health. Automate testing processes by utilizing public cloud elasticity and ephemeral resources, ensuring streamlined operations and reduced manual efforts. Contribute to the software development More ❯

Cloud Engineer

City of London, London, United Kingdom
algo1
how to manage workloads at scale. Proficient with Infrastructure as Code tools and practices. Comfortable writing automation, configuration, and tooling to simplify operations and reduce manual effort. Knowledgeable about observability tools & best practices. Ability to collaborate across teams with excellent written and verbal communication skills. Nice to Have Qualifications: Experience with multi-cloud and/or hybrid deployments. Knowledge of More ❯

Cloud Engineer

London Area, United Kingdom
algo1
how to manage workloads at scale. Proficient with Infrastructure as Code tools and practices. Comfortable writing automation, configuration, and tooling to simplify operations and reduce manual effort. Knowledgeable about observability tools & best practices. Ability to collaborate across teams with excellent written and verbal communication skills. Nice to Have Qualifications: Experience with multi-cloud and/or hybrid deployments. Knowledge of More ❯

DevOps Lead

Birmingham, West Midlands, United Kingdom
Hybrid/Remote Options
Robert Walters
to improve performance Develop strategies to improve performance across group technology DevOps Lead: Experience Technical depth across, but not limited to: Java, UNIX, Linux, Middleware, WebLogic, Cloud Platforms, Observability tools Designing/Developing/Implementing technology advancements Experience of improving resilience of complex production environments The permanent opportunity for a DevOps Lead will pay a salary range of More ❯
Employment Type: Permanent, Work From Home
Salary: £80,000

Analytics Engineer

London, United Kingdom
Tenth Revolution Group
optimise BI dashboards and data products using Tableau, translating business needs into visual insights. Orchestrate and monitor data pipelines, ensuring data quality and timely delivery. Implement data quality checks, observability, and maintain data cataloging and lineage. Drive CI/CD practices using GitHub Actions or similar tools. Collaborate with cross-functional teams to improve platform capabilities and analytics maturity. Requirements More ❯
Employment Type: Permanent
Salary: £70,000 - £85,000 per annum

Analytics Engineer

London, South East, England, United Kingdom
Tenth Revolution Group
optimise BI dashboards and data products using Tableau, translating business needs into visual insights. Orchestrate and monitor data pipelines, ensuring data quality and timely delivery. Implement data quality checks, observability, and maintain data cataloging and lineage. Drive CI/CD practices using GitHub Actions or similar tools. Collaborate with cross-functional teams to improve platform capabilities and analytics maturity. Requirements More ❯
Employment Type: Full-Time
Salary: £70,000 - £85,000 per annum

Python Developer

Hammersmith, England, United Kingdom
Understanding Recruitment
Design schemas and pipelines across Postgres and MongoDB Run CI and CD, improve build times, handle deployments and rollbacks Collaborate with data and ML to productionise models Instrument for observability and own incidents end to end What you will bring 1+ year engineering with strong Python in production Hands on Elasticsearch experience Solid SQL plus practical MongoDB CI and CD More ❯

Observability, England
10th Percentile: £56,250
25th Percentile: £67,500
Median: £80,000
75th Percentile: £105,000
90th Percentile: £146,000