626 to 650 of 819 Observability Jobs in the UK

Principal Machine Learning Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
data engineering teams to implement scalable data lakehouse oriented feature architectures and enterprise‐grade ML governance. Champion engineering standards for model quality, documentation, observability, and platform resilience. Feature Engineering & Data Architecture Architect highly scalable, production‐ready feature pipelines within Lakehouse environments. Set the technical direction for fallback and resilience strategies … including scoring metrics, latency, error analytics, and SLOs. Partner with platform teams to optimise cost, scale, and reliability of inference endpoints. Monitoring, Drift Detection & Observability Define observability standards for feature drift, concept drift, performance degradation, and data integrity. Lead the creation of dashboards, benchmarks, and automated alerting across ...

Data Engineer

Hiring Organisation
LMA Recruitment
Location
South West London, London, England, United Kingdom
Employment Type
Contractor
Contract Rate
£300 - £350 per day
candidate will be responsible for building and maintaining scalable data pipelines within Google Cloud Platform (GCP), ensuring high levels of data quality, reliability, and observability across critical business data platforms. Key Responsibilities Build, maintain, and optimise scalable data pipelines within GCP Develop and manage workflows using Cloud Composer and Apache … Airflow Design and support data solutions using BigQuery Implement data quality checks and monitoring frameworks Improve observability and operational performance of data platforms Troubleshoot and resolve pipeline failures and performance issues Work closely with engineering, analytics, and product teams Follow best practices around testing, deployment, and documentation Required Skills & Experience ...

Data Engineer

Hiring Organisation
Ashdown Group
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 - £95,000 per annum
able to work from home 2 days per week. This is a high-impact role focused on improving data quality, reducing incidents, and building scalable observability across a modern enterprise data platform. You’ll help ensure data across the organisation is accurate, reliable, and trusted for critical business decision-making. … platforms. You’ll have excellent SQL and Python skills coupled with experience working in modern cloud-based data environments. Hands-on experience with data observability tools such as Grafana, Monte Carlo, or Acceldata, and data governance/quality platforms like Informatica, Collibra or Microsoft Purview is highly desirable. Experience within ...

Artificial Intelligence Engineer

Hiring Organisation
Omnis Partners
Location
United Kingdom
into production. This business specialises in solving the difficult engineering challenges that emerge when AI becomes business-critical. Their work spans agentic systems, governance, observability, evaluation frameworks, security, monitoring and enterprise-scale deployment. Founded by seriously impressive and well-networked leaders from top-tier consulting and transformation backgrounds, the business … Building scalable backend services using modern Python and cloud-native technologies Designing evaluation frameworks and testing harnesses to measure AI quality and reliability Implementing observability, monitoring and tracing across AI systems Managing challenges such as hallucinations, prompt injection, latency and model drift Building secure, reliable deployment pipelines for enterprise environments ...

Artificial Intelligence Engineer

Hiring Organisation
Omnis Partners
Location
United Kingdom, UK
into production. This business specialises in solving the difficult engineering challenges that emerge when AI becomes business-critical. Their work spans agentic systems, governance, observability, evaluation frameworks, security, monitoring and enterprise-scale deployment. Founded by seriously impressive and well-networked leaders from top-tier consulting and transformation backgrounds, the business … Building scalable backend services using modern Python and cloud-native technologies Designing evaluation frameworks and testing harnesses to measure AI quality and reliability Implementing observability, monitoring and tracing across AI systems Managing challenges such as hallucinations, prompt injection, latency and model drift Building secure, reliable deployment pipelines for enterprise environments ...

Director, Principal Java Engineer (Investment Banking)

Hiring Organisation
Robert Walters
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£140,000 - £170,000 per annum
volumes of financial and transactional data Contribute directly to architecture, system design, and hands-on software development Drive engineering best practices across automation, testing, observability, and performance Build resilient, production-grade systems with a strong focus on reliability and scalability Work across the full software development lifecycle from design through … scalability, and high-availability systems Experience building automated, production-grade platforms with minimal manual intervention Familiarity with cloud-native technologies, CI/CD, and observability tooling Strong engineering mindset with a hands-on approach to development Interest in modern engineering tooling, including AI-assisted development workflows Robert Walters Operations Limited ...

Site Reliability Engineer

Hiring Organisation
Lorien
Location
Edinburgh, Midlothian, Scotland, United Kingdom
Employment Type
Contractor
Contract Rate
Salary negotiable
production incidents, taking ownership through to resolution. Focus on incident response, service restoration and operational excellence (approximately 70% of the role). Improve system observability, monitoring and alerting capabilities. Work closely with development teams to enhance the reliability and operability of applications. Analyse production issues and identify opportunities for automation … Production Engineering or a similar operational engineering role. Strong hands-on experience supporting live production environments. Excellent troubleshooting and incident management skills. Experience with observability and monitoring platforms, including Grafana, OpenTelemetry and Splunk. Good understanding of cloud platforms (AWS experience preferred). Strong knowledge of APIs and API troubleshooting. Experience ...

Senior Engineering Manager, Developer Experience

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
About the team: The Developer Experience team owns the internal platform that every engineer at the company touches daily: CI/CD pipelines, observability tooling, our developer portal, and an emerging AI platform. It's a high-visibility role: the work you lead directly shapes the productivity of hundreds of engineers … looks like at the company as we scale. What you'll do: Lead and develop a growing team of 5+ highly motivated engineers across observability, CI/CD, developer portal (Backstage), and FinOps tooling, setting clear priorities and establishing strong ways of working. Own and evolve the technical roadmap across ...

Platform Operations Director

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
business continuity across the group. Internal IT & Systems Manages internal IT and business systems administration (M365, NetSuite, SuccessFactors, SharePoint) —infrastructure, integrations, and IAM. Ensures observability and SRE capability is fit for purpose across cloud, hosted, and end-user environments. Vendor & Cost Management Drives cloud and vendor cost discipline — manages …/CD infrastructure requirements. Head of Infrastructure & Cloud — Direct report. Hosting strategy, cloud platform, and FinOps execution. Head of SRE — Direct report. Observability, on-call, and DR/BCP processes. Head of Internal Services — Direct report. Internal IT, business systems, and end-user support. Finance — Direct report. Cloud cost visibility ...

Platform Modernisation Lead

Hiring Organisation
Adecco
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
£800 - £900/day
role demands strong leadership and a strategic mindset as you define and embed our cloud operating model, aligning it with change and release management, observability, monitoring, alerting, and support processes. Key Responsibilities: Lead the design and implementation of a hybrid multi-cloud container platform across Azure, AWS, and GCP. Ensure … corporate governance processes, standards, and tooling. Define and embed a robust cloud operating model that aligns with organisational change and release management. Develop observability, monitoring, and alerting strategies to ensure operational excellence. Maintain end-to-end accountability for platform production readiness, ensuring it meets enterprise standards. Support and enable ...

Senior Developer - Perm - Birmingham

Hiring Organisation
INFUSED SOLUTIONS LIMITED
Location
Birmingham, West Midlands, United Kingdom
Employment Type
Permanent
Salary
£80,000
recurring technical problems and implementing long-term solutions. Improving platform reliability, resilience, and overall product quality. Performing application profiling, performance tuning, and optimisation. Enhancing observability, monitoring, alerting, and diagnostic capabilities. Working with engineering teams to improve development practices and technical standards. Reducing technical debt and identifying opportunities for platform improvement. … Strong communication skills and the ability to collaborate effectively across engineering teams. Desirable Experience working on SaaS platforms or cloud-based applications. Exposure to observability and monitoring tools. Experience with performance profiling and optimisation techniques. Knowledge of scalability, resilience, and reliability engineering principles. Familiarity with CI/CD pipelines ...

Senior Full stack Developer - Birmingham - Perm

Hiring Organisation
INFUSED SOLUTIONS LIMITED
Location
Birmingham, West Midlands, United Kingdom
Employment Type
Permanent
Salary
£80,000
recurring technical problems and implementing long-term solutions. Improving platform reliability, resilience, and overall product quality. Performing application profiling, performance tuning, and optimisation. Enhancing observability, monitoring, alerting, and diagnostic capabilities. Working with engineering teams to improve development practices and technical standards. Reducing technical debt and identifying opportunities for platform improvement. … Strong communication skills and the ability to collaborate effectively across engineering teams. Desirable Experience working on SaaS platforms or cloud-based applications. Exposure to observability and monitoring tools. Experience with performance profiling and optimisation techniques. Knowledge of scalability, resilience, and reliability engineering principles. Familiarity with CI/CD pipelines ...

DevSecOps Capability Manager

Hiring Organisation
WRK DIGITAL LTD
Location
Skipton, North Yorkshire, Yorkshire, United Kingdom
Employment Type
Permanent
improvement Strategy, Governance & Technical Direction Set DevSecOps strategy across pipelines and security automation Establish governance for CI/CD, IaC, and cloud delivery Define observability standards (SLOs, tracing, dashboards) Embed security into pipelines (SAST, SCA, DAST, secrets, IaC scanning) Govern "Golden Path" templates and adoption Operational Oversight & Risk Management Oversee …/CD, DevSecOps, and security integration Strong cloud, containerisation, and IaC knowledge Proven ability to improve DORA and engineering performance metrics Experience with observability and monitoring frameworks Strong background in security tooling (SAST, SCA, DAST, scanning tools) Solid understanding of cloud security, IAM, and zero-trust principles Experience working ...

Digital Senior Full Stack Engineer

Hiring Organisation
Leeds Building Society
Location
Leeds, West Yorkshire, Yorkshire, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£75,000
services. You'll lead complex technical delivery, champion modern engineering practices and help shape high-quality solutions through clean architecture, automation, CI/CD, observability and secure-by-default development. Just as importantly, you'll coach and mentor other engineers, raise standards across the squad and define ways of working. … leading code/design reviews; uplifting test automation and quality gates. Ability to influence stakeholders across Product, Architecture, InfoSec, Risk and Operations; governance experience. Observability experience: metrics, logs, traces; operational ownership of services. Experience of supporting UI/UX Design would be beneficial And in return ...

Senior AI Engineer

Hiring Organisation
Adria Solutions Ltd
Location
Manchester, United Kingdom
Employment Type
Permanent
Salary
£75000 - £110000/annum
solutions securely within enterprise environments. Ensure solutions leverage Private Endpoints, secure networking, identity management, and enterprise-grade governance controls. Establish monitoring, evaluation, and observability frameworks for AI systems, including hallucination detection, model drift monitoring, performance tracking, and cost optimisation. Partner with operational and commercial stakeholders to identify high-value … evaluation. Experience applying Data Science methodologies to solve complex business problems and identify opportunities for AI adoption. Experience with GenAIOps, LLMOps, MLOps, and AI observability platforms. Exposure to Computer Vision, OCR, Voice AI, Conversational AI, or multimodal AI solutions. Experience working within operational, retail, automotive, logistics, or customer-centric organisations. ...

Enterprise Data Architect: AI-Ready Lakehouse & Governance

Hiring Organisation
Jobleads-UK
Location
Fleet, England, United Kingdom
Quantios is a leading provider of software solutions for the trust administration and corporate services industry. With over 30 years of experience, we empower our clients with innovative technology that enhances governance, operations, and investment ...

Senior Automation Engineer

Hiring Organisation
Raytheon
Location
Glenrothes, Fife, Scotland, United Kingdom
Employment Type
Permanent, Work From Home
commissioning of robotic cells and assembly systems; perform First Article Inspection (FAI) and ensure compliance with safety standards i.e. ISO 9001 or AS9100. Observability & Support: Maintain platform observability and respond to incidents through Root Cause Analysis (RCA) to improve service efficiency. System Integration: Designing and implementing interfaces between MES (e.g. ...

Splunk Lead Engineer

Hiring Organisation
VIQU IT
Location
London, Bishopsgate, United Kingdom
Employment Type
Contract
Contract Rate
£550 - £700/day Inside IR35
client, a leading finance house, are looking for a Lead Splunk Engineer to take the lead in the design and implementation of monitoring and observability patterns and standards within the Observability Team. This role will act as a technical authority, ensuring best practices are followed, an automation-first approach is taken … mentoring the team to build sustainable capability, advocate monitoring and observability best practice to the wider technology domain. For this opportunity you will have proven skills in: · Attention to detail with the ability to craft concise, informational user documentation · Experience of researching and developing solutions that expand, modernise or improve ...

Site Reliability Engineer (SRE)

Hiring Organisation
Randstad Technologies Recruitment
Location
Nationwide, United Kingdom
Employment Type
Contract
Contract Rate
£55 - £60/hour
Title: Lead Site Reliability Engineer (SRE) - Observability Location: Reading, UK/Hybrid & Remote Options About the Role We are looking for a Lead SRE to design, scale, and operate massive-scale observability systems that keep our global services online and performant. You will join an autonomous team of software engineers … Thanos/Cortex, Kafka, the ELK stack, Ansible, or Consul. Comfortable diving into unfamiliar codebases and participating in an on-call rotation. Keywords: Observability, Monitoring, SRE, Site Reliability Engineering, DevOps, ElasticSearch, ELK, Prometheus, Kafka, Terraform, Linux, Bare Metal Randstad Technologies is acting as an Employment Business in relation ...

AI Engineer

Hiring Organisation
Hyre AI Limited
Location
Paddington, Warrington, United Kingdom
Employment Type
Permanent
Salary
GBP 60,000 - 80,000 Annual
tool-calling patterns Extend the MCP server with new tools and capabilities Enforce structured outputs and validation across LLM boundaries 2. LLM Quality, Evals & Observability Build the layer that lets the team ship LLM features with confidence. You will: Design and grow the eval platform - golden datasets, regression suites … judge Integrate observability and tracing across providers and prompt versions Track cost, latency, and quality per prompt, model, and client Build guardrails for prompt injection, PII, and output safety Drive prompt engineering practice - versioning, A/B testing, platform overlays 3. Cloud & Data Infrastructure Own the cloud substrate that runs ...

AI Engineer

Hiring Organisation
Hyre AI Limited
Location
City of Westminster, Greater London, Paddington, United Kingdom
Employment Type
Permanent
Salary
£60000 - £80000/annum Plus Equity
tool-calling patterns Extend the MCP server with new tools and capabilities Enforce structured outputs and validation across LLM boundaries 2. LLM Quality, Evals & Observability Build the layer that lets the team ship LLM features with confidence. You will: Design and grow the eval platform - golden datasets, regression suites … judge Integrate observability and tracing across providers and prompt versions Track cost, latency, and quality per prompt, model, and client Build guardrails for prompt injection, PII, and output safety Drive prompt engineering practice - versioning, A/B testing, platform overlays 3. Cloud & Data Infrastructure Own the cloud substrate that runs ...

Cloud Platform Engineer - AWS SRE

Hiring Organisation
Impellam
Location
Glasgow, Lanarkshire, Scotland, United Kingdom
Employment Type
Contractor
Contract Rate
Salary negotiable
rapid incident response, service restoration, root cause analysis, and operational automation. The ideal candidate will have hands-on experience with AWS infrastructure, Snowflake operations, observability tooling, and on-call support in production environments. Key responsibilities: Lead incident triage and resolution for AWS and Snowflake services; monitor alerts, dashboards, and service … Required skills: Strong knowledge of AWS services such as EC2, S3, IAM, VPC, Lambda, and CloudWatch; experience with Snowflake administration and troubleshooting; familiarity with observability tools such as CloudWatch, Datadog, Grafana, or Splunk; understanding of SRE concepts including SLIs, SLOs, error budgets, and incident management; and scripting or automation skills ...

Operations Engineer

Hiring Organisation
Ascent Resourcing Limited
Location
Birmingham, West Midlands, England, United Kingdom
Employment Type
Full-Time
Salary
£55,000 - £60,000 per annum
continuity. Key Responsibilities Provide operational support for enterprise platforms, applications, integrations, and associated technologies. Monitor system health, availability, and performance using monitoring, alerting, and observability tools. Analyse, troubleshoot, and resolve incidents affecting services and platforms. Perform root cause analysis and contribute to implementing permanent solutions to prevent recurring issues. Coordinate … within IT operations, support engineering, or service management environments. Experience supporting business-critical production services and operational platforms. Knowledge of monitoring, logging, alerting, and observability practices. Experience working with incident, problem, change, and release management processes. Excellent communication skills with the ability to collaborate effectively across multiple technical and business ...

Lead AI Engineer

Hiring Organisation
Capco
Location
Borough of Tameside, United Kingdom
Employment Type
Full Time
LLMs and multi-modal models at scale Strong engineering background in Python with proven backend and API development skills Solid understanding of scalable MLOps, observability, and cloud-native AI deployment Excellent communication, problem-solving, and project management skills in agile environments Bonus Points For Experience with agentic frameworks (e.g., LangChain … LlamaIndex) Experience in deep learning frameworks and front-end development Familiarity with Langfuse, LangSmith, or other LLM observability tools Understanding of Model Context Protocol and bias/hallucination mitigation techniques Previous success in integrating GenAI solutions into enterprise-scale systems Why Join Capco Deliver high-impact technology solutions for Tier ...

Senior Director, Data and Information Marketplace

Hiring Organisation
Jobleads-UK
Location
Cambridge, England, United Kingdom
ensuring intuitive experiences for both people and agents across discovery, access, sharing, understanding and use. Advise the development of capabilities for data access, lineage, observability and quality so that data assets are transparent, trusted and usable at scale. Shape enterprise approaches to data, information and knowledge lifecycle management, embedding governance … seamless experiences that are widely adopted by users and machines across multiple enterprise business units. Deep expertise in relevant capability areas, including data quality, observability, lineage, access management, lifecycle management and information governance. Measurable evidence of optimising the data P&L, covering cost/FinOps, value realisation and sustainability. ...