Observability Jobs in London

476 to 500 of 819 Observability Jobs in London

Databricks Data Engineer (Contract)

London, England, United Kingdom
Hybrid / WFH Options
Harnham
developing and scaling the company's core data platform, ensuring teams across the business can access, trust, and use data effectively. You'll drive initiatives that improve data quality, observability, and governance, while helping shape a platform-as-a-product mindset. Key responsibilities include: Building and maintaining data infrastructure: Develop microservices, pipelines, and backend systems that power analytics and machine … learning initiatives. Driving platform evolution: Design and implement scalable, secure, and efficient data services using tools such as Terraform, Docker, and AWS. Data governance and observability: Introduce and enhance tooling for data lineage, contracts, monitoring, and cataloguing. Operational excellence: Lead automation, monitoring, and incident response to maintain high platform reliability. Cross-functional collaboration: Work with data scientists, ML engineers, analysts … Proven track record of designing, building, and scaling data platforms in production environments. Hands-on experience with big data technologies such as Airflow, DBT, Databricks, and data catalogue/observability tools (e.g. Monte Carlo, Atlan, Datahub). Knowledge of cloud infrastructure (AWS or GCP) - including services such as S3, RDS, EMR, ECS, IAM. Experience with DevOps tooling, particularly Terraform and …

Databricks Data Engineer Contract

London, South East, England, United Kingdom
Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
developing and scaling the company's core data platform, ensuring teams across the business can access, trust, and use data effectively. You'll drive initiatives that improve data quality, observability, and governance, while helping shape a platform-as-a-product mindset. Key responsibilities include: Building and maintaining data infrastructure: Develop microservices, pipelines, and backend systems that power analytics and machine … learning initiatives. Driving platform evolution: Design and implement scalable, secure, and efficient data services using tools such as Terraform, Docker, and AWS. Data governance and observability: Introduce and enhance tooling for data lineage, contracts, monitoring, and cataloguing. Operational excellence: Lead automation, monitoring, and incident response to maintain high platform reliability. Cross-functional collaboration: Work with data scientists, ML engineers, analysts … Proven track record of designing, building, and scaling data platforms in production environments. Hands-on experience with big data technologies such as Airflow, DBT, Databricks, and data catalogue/observability tools (e.g. Monte Carlo, Atlan, Datahub). Knowledge of cloud infrastructure (AWS or GCP) - including services such as S3, RDS, EMR, ECS, IAM. Experience with DevOps tooling, particularly Terraform and …
Employment Type: Contractor
Rate: £550 - £600 per day

Site Reliability Engineer

South West London, London, United Kingdom
REVYBE IT RECRUITMENT LIMITED
make technical decisions and help shape how platform engineering is done as the team continues to scale. Tech stack: AWS (core services - EC2, RDS, S3, IAM, etc.); Monitoring and Observability (Grafana, Prometheus); Kubernetes (building and managing production clusters); Terraform (IaC provisioning); Python, Bash or Go (scripting, automation); GitHub Actions (CI/CD pipelines). What They're Looking For: Experience in AWS … cloud infrastructure (ideally in a regulated or high-traffic environment). Previous experience working with monitoring and observability tools. Hands-on Kubernetes know-how, specifically with EKS. Solid IaC experience with Terraform. Experience with containerisation (Docker, Helm) and CI/CD (GitHub Actions or similar). Solid scripting/automation experience with Python, Bash or Go. A good communicator who enjoys working …
Employment Type: Permanent
Salary: £75,000

Platform Data Engineer

London, England, United Kingdom
Harnham
and non-technical partners to deliver resilient infrastructure, champion data governance, and mentor others in engineering excellence. In this role, you will: Shape the data platform roadmap: Introduce modern observability, quality, and governance frameworks that elevate how teams access and trust data. Build and scale infrastructure: Develop services, APIs, and data pipelines using modern cloud tooling and automation-first principles. … scalable data platforms in production, enabling advanced users such as ML and analytics engineers. Hands-on experience with modern data stack tools - Airflow, DBT, Databricks, and data catalogue/observability solutions like Monte Carlo, Atlan, or DataHub. Solid understanding of cloud environments (AWS or GCP), including IAM, S3, ECS, RDS, or equivalent services. Experience implementing Infrastructure as Code (Terraform) and … e.g., Jenkins, GitHub Actions). A mindset focused on continuous improvement, learning, and staying at the forefront of emerging technologies. Nice to Have Experience rolling out data governance and observability frameworks, including lineage tracking, SLAs, and data quality monitoring. Familiarity with modern data lake table formats such as Delta Lake, Iceberg, or Hudi. Background in stream processing (Kafka, Flink, or …

Platform Data Engineer

London, South East, England, United Kingdom
Harnham - Data & Analytics Recruitment
and non-technical partners to deliver resilient infrastructure, champion data governance, and mentor others in engineering excellence. In this role, you will: Shape the data platform roadmap: Introduce modern observability, quality, and governance frameworks that elevate how teams access and trust data. Build and scale infrastructure: Develop services, APIs, and data pipelines using modern cloud tooling and automation-first principles. … scalable data platforms in production, enabling advanced users such as ML and analytics engineers. Hands-on experience with modern data stack tools - Airflow, DBT, Databricks, and data catalogue/observability solutions like Monte Carlo, Atlan, or DataHub. Solid understanding of cloud environments (AWS or GCP), including IAM, S3, ECS, RDS, or equivalent services. Experience implementing Infrastructure as Code (Terraform) and … e.g., Jenkins, GitHub Actions). A mindset focused on continuous improvement, learning, and staying at the forefront of emerging technologies. Nice to Have Experience rolling out data governance and observability frameworks, including lineage tracking, SLAs, and data quality monitoring. Familiarity with modern data lake table formats such as Delta Lake, Iceberg, or Hudi. Background in stream processing (Kafka, Flink, or …
Employment Type: Contractor
Rate: £500 - £600 per day

DevOps Engineer

London Area, United Kingdom
Venture Up
GitLab CI, GitHub Actions, Jenkins, or similar) Proficiency in scripting languages (Python, Bash, or Ruby) Experience with containerization technologies (Docker, and ideally Kubernetes or ECS) Knowledge of monitoring and observability tools (Prometheus, Grafana, CloudWatch, or similar) Understanding of networking concepts, security best practices, and IAM management Experience with configuration management and automation tools Strong problem-solving skills and ability to … and optimising cloud spend to ensure cost efficiency Owning CI/CD pipelines, improving speed and reliability of deployments Providing internal tooling and automation to support development teams Improving observability through logging, metrics, and tracing Ensuring high availability, scalability, and disaster recovery planning Managing security, IAM policies, and compliance best practices Working closely with developers to support deployments and testing …

DevOps Engineer

City of London, London, United Kingdom
Venture Up
GitLab CI, GitHub Actions, Jenkins, or similar) Proficiency in scripting languages (Python, Bash, or Ruby) Experience with containerization technologies (Docker, and ideally Kubernetes or ECS) Knowledge of monitoring and observability tools (Prometheus, Grafana, CloudWatch, or similar) Understanding of networking concepts, security best practices, and IAM management Experience with configuration management and automation tools Strong problem-solving skills and ability to … and optimising cloud spend to ensure cost efficiency Owning CI/CD pipelines, improving speed and reliability of deployments Providing internal tooling and automation to support development teams Improving observability through logging, metrics, and tracing Ensuring high availability, scalability, and disaster recovery planning Managing security, IAM policies, and compliance best practices Working closely with developers to support deployments and testing …

DevOps Engineer

London, South East England, United Kingdom
Venture Up
GitLab CI, GitHub Actions, Jenkins, or similar) Proficiency in scripting languages (Python, Bash, or Ruby) Experience with containerization technologies (Docker, and ideally Kubernetes or ECS) Knowledge of monitoring and observability tools (Prometheus, Grafana, CloudWatch, or similar) Understanding of networking concepts, security best practices, and IAM management Experience with configuration management and automation tools Strong problem-solving skills and ability to … and optimising cloud spend to ensure cost efficiency Owning CI/CD pipelines, improving speed and reliability of deployments Providing internal tooling and automation to support development teams Improving observability through logging, metrics, and tracing Ensuring high availability, scalability, and disaster recovery planning Managing security, IAM policies, and compliance best practices Working closely with developers to support deployments and testing …

DevOps Engineer

London (City of London), South East England, United Kingdom
Venture Up
GitLab CI, GitHub Actions, Jenkins, or similar) Proficiency in scripting languages (Python, Bash, or Ruby) Experience with containerization technologies (Docker, and ideally Kubernetes or ECS) Knowledge of monitoring and observability tools (Prometheus, Grafana, CloudWatch, or similar) Understanding of networking concepts, security best practices, and IAM management Experience with configuration management and automation tools Strong problem-solving skills and ability to … and optimising cloud spend to ensure cost efficiency Owning CI/CD pipelines, improving speed and reliability of deployments Providing internal tooling and automation to support development teams Improving observability through logging, metrics, and tracing Ensuring high availability, scalability, and disaster recovery planning Managing security, IAM policies, and compliance best practices Working closely with developers to support deployments and testing …

Senior Platform Engineer

City Of London, England, United Kingdom
Develop
Apply strong networking knowledge to optimise performance, security, and reliability. Ensure compliance with financial services regulations and internal security policies. Contribute to CI/CD pipelines and cloud-native observability solutions. Explore and integrate emerging technologies including AI/LLM-based solutions to enhance automation and operational efficiency. Key Skills & Experience Essential: Strong hands-on experience with AWS (EC2, VPC … or working with AI/LLM solutions Familiarity with Terraform, Ansible, GitLab CI/CD, or similar tools Exposure to financial services or other highly regulated industries Experience with observability stacks (Prometheus, Grafana, ELK, etc. …

Senior Platform Engineer

London, South East England, United Kingdom
Develop
Apply strong networking knowledge to optimise performance, security, and reliability. Ensure compliance with financial services regulations and internal security policies. Contribute to CI/CD pipelines and cloud-native observability solutions. Explore and integrate emerging technologies including AI/LLM-based solutions to enhance automation and operational efficiency. Key Skills & Experience Essential: Strong hands-on experience with AWS (EC2, VPC … or working with AI/LLM solutions Familiarity with Terraform, Ansible, GitLab CI/CD, or similar tools Exposure to financial services or other highly regulated industries Experience with observability stacks (Prometheus, Grafana, ELK, etc. …

Senior Platform Engineer

London (City of London), South East England, United Kingdom
Develop
Apply strong networking knowledge to optimise performance, security, and reliability. Ensure compliance with financial services regulations and internal security policies. Contribute to CI/CD pipelines and cloud-native observability solutions. Explore and integrate emerging technologies including AI/LLM-based solutions to enhance automation and operational efficiency. Key Skills & Experience Essential: Strong hands-on experience with AWS (EC2, VPC … or working with AI/LLM solutions Familiarity with Terraform, Ansible, GitLab CI/CD, or similar tools Exposure to financial services or other highly regulated industries Experience with observability stacks (Prometheus, Grafana, ELK, etc. …

Senior Platform Engineer

London, UK
Develop
Apply strong networking knowledge to optimise performance, security, and reliability. Ensure compliance with financial services regulations and internal security policies. Contribute to CI/CD pipelines and cloud-native observability solutions. Explore and integrate emerging technologies including AI/LLM-based solutions to enhance automation and operational efficiency. Key Skills & Experience Essential: Strong hands-on experience with AWS (EC2, VPC … or working with AI/LLM solutions Familiarity with Terraform, Ansible, GitLab CI/CD, or similar tools Exposure to financial services or other highly regulated industries Experience with observability stacks (Prometheus, Grafana, ELK, etc. …
Employment Type: Full-time

Performance Tester

City of London, London, United Kingdom
Bestman Solutions
performance, scalability, failover, DR, resilience, alerting, and monitoring. Design and execute load, stress, endurance, and failover tests using industry-standard tools such as JMeter, LoadRunner, or ADS. Set up observability dashboards (Grafana, Splunk, Dynatrace, Kibana, or Datadog) to monitor test execution and system performance. Analyse results to identify performance bottlenecks, system vulnerabilities, and areas for optimisation. Report findings and recommendations … business teams. Experience working in Agile delivery environments with cross-functional teams. Nice to Have Background in financial services or experience supporting legacy-to-modernisation migrations. Understanding of infrastructure observability, cloud platforms, and microservice orchestration. Exposure to automation frameworks and scripting for performance testing. This is a key role within a global transformation programme, offering the chance to shape how …

Performance Tester

London Area, United Kingdom
Bestman Solutions
performance, scalability, failover, DR, resilience, alerting, and monitoring. Design and execute load, stress, endurance, and failover tests using industry-standard tools such as JMeter, LoadRunner, or ADS. Set up observability dashboards (Grafana, Splunk, Dynatrace, Kibana, or Datadog) to monitor test execution and system performance. Analyse results to identify performance bottlenecks, system vulnerabilities, and areas for optimisation. Report findings and recommendations … business teams. Experience working in Agile delivery environments with cross-functional teams. Nice to Have Background in financial services or experience supporting legacy-to-modernisation migrations. Understanding of infrastructure observability, cloud platforms, and microservice orchestration. Exposure to automation frameworks and scripting for performance testing. This is a key role within a global transformation programme, offering the chance to shape how …

Performance Tester

London, South East England, United Kingdom
Bestman Solutions
performance, scalability, failover, DR, resilience, alerting, and monitoring. Design and execute load, stress, endurance, and failover tests using industry-standard tools such as JMeter, LoadRunner, or ADS. Set up observability dashboards (Grafana, Splunk, Dynatrace, Kibana, or Datadog) to monitor test execution and system performance. Analyse results to identify performance bottlenecks, system vulnerabilities, and areas for optimisation. Report findings and recommendations … business teams. Experience working in Agile delivery environments with cross-functional teams. Nice to Have Background in financial services or experience supporting legacy-to-modernisation migrations. Understanding of infrastructure observability, cloud platforms, and microservice orchestration. Exposure to automation frameworks and scripting for performance testing. This is a key role within a global transformation programme, offering the chance to shape how …

Performance Tester

London (City of London), South East England, United Kingdom
Bestman Solutions
performance, scalability, failover, DR, resilience, alerting, and monitoring. Design and execute load, stress, endurance, and failover tests using industry-standard tools such as JMeter, LoadRunner, or ADS. Set up observability dashboards (Grafana, Splunk, Dynatrace, Kibana, or Datadog) to monitor test execution and system performance. Analyse results to identify performance bottlenecks, system vulnerabilities, and areas for optimisation. Report findings and recommendations … business teams. Experience working in Agile delivery environments with cross-functional teams. Nice to Have Background in financial services or experience supporting legacy-to-modernisation migrations. Understanding of infrastructure observability, cloud platforms, and microservice orchestration. Exposure to automation frameworks and scripting for performance testing. This is a key role within a global transformation programme, offering the chance to shape how …

Linux Production Engineer

City of London, London, United Kingdom
Autonomai Recruitment
distributed systems. Contribute to ongoing improvements in reliability, latency, and scalability. Qualifications: Linux expertise with a solid understanding of networking and containerisation. Proficiency in at least Python. Experience with observability tooling. Proven track record in designing and maintaining highly distributed systems. Apply now for a confidential chat.

Linux Production Engineer

London Area, United Kingdom
Autonomai Recruitment
distributed systems. Contribute to ongoing improvements in reliability, latency, and scalability. Qualifications: Linux expertise with a solid understanding of networking and containerisation. Proficiency in at least Python. Experience with observability tooling. Proven track record in designing and maintaining highly distributed systems. Apply now for a confidential chat.

Linux Production Engineer

London, South East England, United Kingdom
Autonomai Recruitment
distributed systems. Contribute to ongoing improvements in reliability, latency, and scalability. Qualifications: Linux expertise with a solid understanding of networking and containerisation. Proficiency in at least Python. Experience with observability tooling. Proven track record in designing and maintaining highly distributed systems. Apply now for a confidential chat.

Linux Production Engineer

London (City of London), South East England, United Kingdom
Autonomai Recruitment
distributed systems. Contribute to ongoing improvements in reliability, latency, and scalability. Qualifications: Linux expertise with a solid understanding of networking and containerisation. Proficiency in at least Python. Experience with observability tooling. Proven track record in designing and maintaining highly distributed systems. Apply now for a confidential chat.

Cloud Engineer - Azure

City of London, London, United Kingdom
Vallum Associates
Azure Container Instances), and ACA (Azure Container Apps). Create and maintain comprehensive documentation on newly implemented features suitable for an enterprise environment. Design and implement robust monitoring and observability tools to track container performance and health. Automate testing processes by utilizing public cloud elasticity and ephemeral resources, ensuring streamlined operations and reduced manual efforts. Contribute to the software development …

Cloud Engineer - Azure

London Area, United Kingdom
Vallum Associates
Azure Container Instances), and ACA (Azure Container Apps). Create and maintain comprehensive documentation on newly implemented features suitable for an enterprise environment. Design and implement robust monitoring and observability tools to track container performance and health. Automate testing processes by utilizing public cloud elasticity and ephemeral resources, ensuring streamlined operations and reduced manual efforts. Contribute to the software development …

Cloud Engineer - Azure

London, South East England, United Kingdom
Vallum Associates
Azure Container Instances), and ACA (Azure Container Apps). Create and maintain comprehensive documentation on newly implemented features suitable for an enterprise environment. Design and implement robust monitoring and observability tools to track container performance and health. Automate testing processes by utilizing public cloud elasticity and ephemeral resources, ensuring streamlined operations and reduced manual efforts. Contribute to the software development …

Cloud Engineer - Azure

London (City of London), South East England, United Kingdom
Vallum Associates
Azure Container Instances), and ACA (Azure Container Apps). Create and maintain comprehensive documentation on newly implemented features suitable for an enterprise environment. Design and implement robust monitoring and observability tools to track container performance and health. Automate testing processes by utilizing public cloud elasticity and ephemeral resources, ensuring streamlined operations and reduced manual efforts. Contribute to the software development …
Observability (London)
10th Percentile: £64,500
25th Percentile: £73,750
Median: £90,000
75th Percentile: £115,000
90th Percentile: £158,500