Observability Jobs in the City of London

25 of 25 Observability Jobs in the City of London

DevOps Lead

City of London, London, United Kingdom
rmg digital
Jira, TeamCity. Expert-level knowledge of DevOps tools like Bitbucket/GitHub, SonarQube, CAST, TeamCity/Jenkins/Azure DevOps. Expert-level knowledge of telemetry and observability platforms like ELK stack, Grafana, Kibana, Azure Application Insights, AWS CloudWatch, etc. Scripting languages, preferably Python, PowerShell. Database technologies, preferably MS SQL Server, PostgreSQL. Infrastructure as code – AWS …

DevOps Lead

london (city of london), south east england, united kingdom
rmg digital
Jira, TeamCity. Expert-level knowledge of DevOps tools like Bitbucket/GitHub, SonarQube, CAST, TeamCity/Jenkins/Azure DevOps. Expert-level knowledge of telemetry and observability platforms like ELK stack, Grafana, Kibana, Azure Application Insights, AWS CloudWatch, etc. Scripting languages, preferably Python, PowerShell. Database technologies, preferably MS SQL Server, PostgreSQL. Infrastructure as code – AWS …

AI Engineer

City of London, London, United Kingdom
Adecco
as-Code (IaC): Terraform, Pulumi. Containerization & Orchestration: Docker, Kubernetes (GKE/AKS). Networking & Isolation: VPCs, private endpoints, firewall rules, network policies. Data Sandboxing: Synthetic datasets, masking, DLP tooling. Monitoring & Observability: Prometheus, Grafana, Cloud Logging …
Employment Type: Contract
Rate: £850 - £950/day

AWS Data Engineer

City of London, London, United Kingdom
HCLTech
Apache Airflow for orchestrating complex data workflows and ensuring reliable execution. Understanding of cloud security and governance practices including IAM, KMS, and data access policies. Experience with monitoring and observability tools such as CloudWatch. Experience working in Agile/Scrum environments, participating in sprint planning, retrospectives, and backlog grooming. Good to Have: Exposure to Azure data services such as Azure …

AWS Data Engineer

london (city of london), south east england, united kingdom
HCLTech
Apache Airflow for orchestrating complex data workflows and ensuring reliable execution. Understanding of cloud security and governance practices including IAM, KMS, and data access policies. Experience with monitoring and observability tools such as CloudWatch. Experience working in Agile/Scrum environments, participating in sprint planning, retrospectives, and backlog grooming. Good to Have: Exposure to Azure data services such as Azure …

DevOps Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Talent
pipelines, monitoring, and infrastructure provisioning. Collaborate with developers and engineers to streamline deployments and workflows. Manage AWS services effectively and efficiently. Promote best practices in Infrastructure as Code (IaC), observability, and DevSecOps. Experience & Skills Required: Active SC Clearance – mandatory requirement. Strong hands-on experience with AWS services. Proficiency in Terraform and IaC principles. Solid understanding of CI/CD pipelines …

DevOps Engineer

london (city of london), south east england, united kingdom
Hybrid / WFH Options
Talent
pipelines, monitoring, and infrastructure provisioning. Collaborate with developers and engineers to streamline deployments and workflows. Manage AWS services effectively and efficiently. Promote best practices in Infrastructure as Code (IaC), observability, and DevSecOps. Experience & Skills Required: Active SC Clearance – mandatory requirement. Strong hands-on experience with AWS services. Proficiency in Terraform and IaC principles. Solid understanding of CI/CD pipelines …

Senior Data Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Hlx Technology
neuroscience, and clinical datasets. Build a unified feature store to serve ML training and downstream biological analysis. Develop scalable storage, ingestion, and validation systems with a focus on robustness, observability, and versioning. Collaborate with ML researchers and biologists to translate raw data into actionable insights and high-quality training data. Scale distributed systems using Kubernetes, Terraform, and orchestration tools such …

Senior Data Engineer

london (city of london), south east england, united kingdom
Hybrid / WFH Options
Hlx Technology
neuroscience, and clinical datasets. Build a unified feature store to serve ML training and downstream biological analysis. Develop scalable storage, ingestion, and validation systems with a focus on robustness, observability, and versioning. Collaborate with ML researchers and biologists to translate raw data into actionable insights and high-quality training data. Scale distributed systems using Kubernetes, Terraform, and orchestration tools such …

Azure AI DevOps Engineer - GBP 50000

City of London, London, United Kingdom
Nextech Group Limited
/CD pipelines with Azure DevOps, ensuring robust version control, testing, and seamless deployment. Monitor production ML systems for performance, data drift, and anomalies using Azure Monitor or other observability tools. Schedule and automate model retraining pipelines to maintain performance over time. 3. Data Engineering & Preprocessing: Develop and maintain scalable ETL/ELT data pipelines using Azure Data Factory, Data …
Employment Type: Permanent
Salary: £50,000

DevOps Engineer - London Market

City of London, London, United Kingdom
CBSbutler Holdings Limited trading as CBSbutler
segregation (dev, test, UAT, prod). - Automate infrastructure using Infrastructure as Code (Terraform, ARM, CloudFormation) - Embed security and compliance controls (SAST/DAST/IaC/SBOM). - Enable observability (logging, metrics, tracing, alerting) and support SRE/incident management practices. - Partner with client stakeholders to align DevOps with FCA/PRA operational resilience and Lloyd's standards. - Support disaster … Azure and/or AWS in enterprise or hybrid environments. - Familiarity with containerisation & orchestration (Docker, Kubernetes). - Solid understanding of security controls and compliance in financial services. - Experience with observability tools (Prometheus, Grafana, ELK, Splunk, AppDynamics, etc.). - Awareness of UK/EU financial regulations (GDPR, PRA/FCA, Lloyd's). - Consulting experience desirable - with the ability to engage …
Employment Type: Permanent
Salary: £75,000 - £100,000/annum, Bonus + Full Benefits

Data Platform Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Rise Technical Recruitment Limited
data is delivered on time and without failure. The ideal candidate will have strong experience working with streaming and batch data systems, a solid understanding of monitoring and observability, and hands-on experience working with AWS, Apache Flink, Kafka, and Python. This is a fantastic opportunity to step into an SRE role focused on data reliability in a modern …
Employment Type: Permanent, Work From Home
Salary: £90,000

AWS DevOps Engineer

City of London, London, England, United Kingdom
Revybe IT Recruitment Ltd
and help shape how platform engineering is done as the team continues to scale. Tech stack: AWS (core services - EC2, RDS, S3, IAM, etc.); Configuration Management: Ansible; Monitoring and Observability: Grafana, Prometheus; Kubernetes (building and managing production clusters); Terraform (IaC provisioning); Python or Java (scripting, automation); GitHub Actions (CI/CD pipelines). What They’re Looking For: Experience in AWS … cloud infrastructure (ideally in a regulated or high-traffic environment). Previous experience working with monitoring and observability tools. Hands-on Kubernetes know-how, specifically with EKS. Solid IaC experience with Terraform. Experience with containerisation (Docker, Helm) and CI/CD (GitHub Actions or similar). Solid scripting/automation experience with Python or Java. A good communicator who enjoys working collaboratively …
Employment Type: Full-Time
Salary: £55,000 - £75,000 per annum

Lead Site Reliability Engineer

City of London, London, United Kingdom
TechNET IT Recruitment Ltd
My client is transforming observability with a modern, full-stack platform that delivers logs, metrics, traces, and security monitoring, cutting costs by up to 70% while boosting efficiency. They are looking for a Lead SRE to own and elevate their Alerting & Incident Management platform. You’ll be the driving force behind reliability, customer satisfaction, and product excellence, ensuring smooth …

AWS Platform Engineer

City of London, London, England, United Kingdom
Revybe IT Recruitment Ltd
and help shape how platform engineering is done as the team continues to scale. Tech stack: AWS (core services - EC2, RDS, S3, IAM, etc.); Configuration Management: Ansible; Monitoring and Observability: Grafana, Prometheus; Kubernetes (building and managing production clusters); Terraform (IaC provisioning); GitHub Actions (CI/CD pipelines). What They’re Looking For: Experience in AWS cloud infrastructure (ideally in a … regulated or high-traffic environment). Previous experience working with monitoring and observability tools. Hands-on Kubernetes know-how, specifically with EKS. Solid IaC experience with Terraform. Experience with containerisation (Docker, Helm) and CI/CD (GitHub Actions or similar). A good communicator who enjoys working collaboratively across product and engineering. The client is willing to take someone that doesn't …
Employment Type: Full-Time
Salary: £55,000 - £75,000 per annum

Lead Machine Learning Engineer

City of London, London, United Kingdom
Harnham
performing team of ML engineers. Combine ML with physics-based risk models (flooding, tropical cyclones, wildfires) to deliver grounded, high-impact solutions. Establish gold-standard practices for evaluation, deployment, observability, and maintainability in ML model development. Turn complex technical challenges into clear business outcomes for colleagues and customers. Requirements: MSc or PhD Degree in Computer Science, Artificial Intelligence, Mathematics, Statistics …

Lead Machine Learning Engineer

london (city of london), south east england, united kingdom
Harnham
performing team of ML engineers. Combine ML with physics-based risk models (flooding, tropical cyclones, wildfires) to deliver grounded, high-impact solutions. Establish gold-standard practices for evaluation, deployment, observability, and maintainability in ML model development. Turn complex technical challenges into clear business outcomes for colleagues and customers. Requirements: MSc or PhD Degree in Computer Science, Artificial Intelligence, Mathematics, Statistics …

Platform Director

City of London, London, United Kingdom
Hlx Life Sciences
in biotech, pharma, or AI-driven drug discovery. Experience in both large organisations (with structured processes and metrics) and smaller/startup environments (delivering with limited resources). Knowledge of observability and reliability practices for product platforms. Security or compliance experience. Why Join? Be part of a world-class AI-first research environment shaping the future of drug discovery. Work on …

Platform Director

london (city of london), south east england, united kingdom
Hlx Life Sciences
in biotech, pharma, or AI-driven drug discovery. Experience in both large organisations (with structured processes and metrics) and smaller/startup environments (delivering with limited resources). Knowledge of observability and reliability practices for product platforms. Security or compliance experience. Why Join? Be part of a world-class AI-first research environment shaping the future of drug discovery. Work on …

Elastic Stack/ELK Engineer

City, London, United Kingdom
Square One Resources
/Experience: Deep technical expertise with the Elastic Stack, including life cycle management, performance tuning, and API-driven integrations. Strong understanding of Linux systems, containerisation, networking, and cloud-native observability practices. Hands-on experience with automation and scripting (Python, Shell, SQL, PowerShell). Knowledge of trading environments, connectivity, and financial protocols (FIX, Market Data, Order Entry). Background in DevOps … Desirable Skills/Experience: Experience with low-latency messaging systems such as Solace, 29West, or Tibco. Familiarity with monitoring and telemetry platforms such as Corvil or Pico. Knowledge of observability solutions like ITRS Geneos integrated with Elastic Stack. Experience deploying and supporting machine-learning models in production environments. If you are interested in this opportunity, please apply now with your …
Employment Type: Contract
Rate: £492/day

Data Science Tech Lead: GenAI

City of London, London, United Kingdom
Hybrid / WFH Options
Anecdote
up and harden RAG pipelines (indexing, retrieval policies, grounding, guardrails) and agent frameworks. Take basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost tuning. Participate in on-call for your area and drive root-cause analysis with crisp follow-ups. 15% Collaborate: Pair with back-end & front-end to wire extractors … evals; hands-on with time-series analysis (forecasting, change-point, drift). Cloud & ops: Basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost control. Communication: You explain results clearly, align stakeholders, and write crisp docs. Bonus points: DevOps wizardry; GPU/accelerator experience. Multimodal pipelines (text + voice + screenshots). …

Data Science Tech Lead: GenAI

london (city of london), south east england, united kingdom
Hybrid / WFH Options
Anecdote
up and harden RAG pipelines (indexing, retrieval policies, grounding, guardrails) and agent frameworks. Take basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost tuning. Participate in on-call for your area and drive root-cause analysis with crisp follow-ups. 15% Collaborate: Pair with back-end & front-end to wire extractors … evals; hands-on with time-series analysis (forecasting, change-point, drift). Cloud & ops: Basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost control. Communication: You explain results clearly, align stakeholders, and write crisp docs. Bonus points: DevOps wizardry; GPU/accelerator experience. Multimodal pipelines (text + voice + screenshots). …

Splunk ITSI Expert / Observability Engineer (Level 4)

City of London, London, United Kingdom
Randstad Technologies Recruitment
We are seeking a highly experienced Splunk ITSI Expert with 10+ years in observability to enhance our monitoring and analytics capabilities. Key Responsibilities: Design and implement advanced monitoring strategies using Splunk IT Service Intelligence (ITSI). Create service models, define KPIs, and build glass tables to visualize key business services. Utilize Splunk ES for security event monitoring and correlation searches. … Automate tasks and integrate systems using Python, Shell, or Perl scripting. Perform root cause analysis and anomaly detection by analyzing complex log data. Requirements: 10+ years' experience in observability, with deep expertise in Splunk, especially ITSI. Proficiency in scripting (Shell/PowerShell/Python). Strong understanding of load balancers such as F5, NetScaler, and AWS ELB. Hands-on experience …
Employment Type: Contract
Rate: £300 - £380/day

Talent Partner, AI Infrastructure & Engineering

City of London, London, United Kingdom
Hybrid / WFH Options
Nscale
Process Improvement & Market Insights: Use funnel metrics and quality-of-hire data to iterate processes and raise the bar. Stay current on infra/AI hiring trends (e.g., HPC, orchestration, observability, power & cooling in DCs) and translate into sourcing strategies. About You: Heavy experience in Infrastructure & Engineering hiring for an AI Infra/DC/HPC organisation. Technical fluency to credibly … engage infra candidates and understand GPUs & containers (Kubernetes/Docker), Linux fundamentals, basic networking & storage, observability and data-centre operations (power/cooling/uptime). Strong capability to design structured interviews and evaluate senior talent. Proven stakeholder management with technical leadership; able to influence with data. Experience in high-growth start-up/scale-up environments; thrives in ambiguity. …

Dynatrace Subject Matter Expert - Data Resilience

City of London, London, United Kingdom
Adecco
End Monitoring coverage exists across the Group's business-critical applications. We are seeking a skilled Dynatrace Admin/Consultant to play a key role in the enablement of observability across complex, hybrid cloud environments. The ideal candidate will have deep expertise in Dynatrace implementation (SaaS and On-Premises), monitoring configuration, and AI-driven insights to support performance, reliability, and … Work together with all parties to identify opportunities for enhancement to monitoring configuration and capabilities across critical applications. Participate in the review of roles and responsibilities between teams for observability and make recommendations for improvement of the standards with an emphasis on Operational Resilience. Play a key part in providing an automatically maintained end-to-end business flow for each … Collaborate with Application Stewards and Site Reliability Engineers (SREs) to ensure alerting configuration is optimal and fit for purpose. Participate in workshops with third-party software suppliers to review observability standards. What You'll Need: The ability to demonstrate your extensive experience in designing and configuring the following within Dynatrace: application performance monitoring; anomaly detection profiles; alerting …
Employment Type: Contract

Observability in the City of London
10th Percentile: £65,000
25th Percentile: £71,250
Median: £77,500
75th Percentile: £83,750
90th Percentile: £94,250