676 to 700 of 819 Observability Jobs in the UK

Performance and Monitoring Engineer

Hiring Organisation
Solus Accident Repair Centres
Location
Stansted, Birchanger, Essex, United Kingdom
Employment Type
Permanent
Salary
£40,000 - £50,000 per annum
talented Performance and Monitoring Engineer to help us strengthen the stability, reliability and performance of our systems. If you're passionate about monitoring, observability and using data to proactively improve service health, this is a great opportunity to make a real impact across a large, modern technology estate. Responsibilities … improve speed, accuracy and consistency Supporting major changes, deployments and post-incident reviews with data-driven evidence Qualifications Strong experience with monitoring and observability tools (LogicMonitor, Azure Monitor, App Insights, Log Analytics, Defender for Cloud) Excellent understanding of cloud performance, IaaS/PaaS, networking fundamentals, API performance and capacity modelling ...

AI Engineering Manager

Hiring Organisation
Gravitas Recruitment Group (Global) Ltd
Location
London, UK
business needs into technical deliverables. Drive agentic workflows and AI tooling adoption across the product development lifecycle to deliver tangible value. Establish robust evaluation, observability, and quality practices for AI systems, balancing speed with reliability. Guide teams through ambiguity and rapid change, making pragmatic decisions and removing blockers. Measure success … development. Hands-on experience with AI models, tools, and frameworks, including agent orchestration, prompt engineering, RAG pipelines, evaluation frameworks, LangChain, Codex, Claude, Gemini, and observability tools and best practices. Strong technical problem-solving skills and the ability to guide teams through ambiguous, fast-changing environments. Excellent communication skills across technical ...

AI Engineering Manager

Hiring Organisation
Gravitas Recruitment Group (Global) Ltd
Location
London Area, United Kingdom
business needs into technical deliverables. Drive agentic workflows and AI tooling adoption across the product development lifecycle to deliver tangible value. Establish robust evaluation, observability, and quality practices for AI systems, balancing speed with reliability. Guide teams through ambiguity and rapid change, making pragmatic decisions and removing blockers. Measure success … development. Hands-on experience with AI models, tools, and frameworks, including agent orchestration, prompt engineering, RAG pipelines, evaluation frameworks, LangChain, Codex, Claude, Gemini, and observability tools and best practices. Strong technical problem-solving skills and the ability to guide teams through ambiguous, fast-changing environments. Excellent communication skills across technical ...

Gen AI Architect - London, UK

Hiring Organisation
Capgemini
Location
Greater London, United Kingdom
Employment Type
Full Time
production-grade AI systems using Amazon Bedrock, retrieval-augmented generation (RAG), agentic workflows, and cloud-native AWS services. Drive architecture standards, model orchestration, governance, observability, and operational excellence across the GenAI lifecycle while collaborating with engineering, security, compliance, and business stakeholders Hybrid working: The places that you work from … customization, prompt orchestration, retrieval pipelines, and agentic workflows Design agentic AI systems incorporating tool use, workflow orchestration, memory management, and autonomous decision flows Implement observability for prompts, model responses, vector retrieval quality, and agent execution workflows Integrate GenAI capabilities into enterprise applications, APIs, workflow platforms, and data ecosystems Work with ...

AI Engineering Manager

Hiring Organisation
Gravitas Recruitment Group (Global) Ltd
Location
City of London, Greater London, UK
business needs into technical deliverables. Drive agentic workflows and AI tooling adoption across the product development lifecycle to deliver tangible value. Establish robust evaluation, observability, and quality practices for AI systems, balancing speed with reliability. Guide teams through ambiguity and rapid change, making pragmatic decisions and removing blockers. Measure success … development. Hands-on experience with AI models, tools, and frameworks, including agent orchestration, prompt engineering, RAG pipelines, evaluation frameworks, LangChain, Codex, Claude, Gemini, and observability tools and best practices. Strong technical problem-solving skills and the ability to guide teams through ambiguous, fast-changing environments. Excellent communication skills across technical ...

IT&D Director, IT Operations

Hiring Organisation
Reckitt
Location
Gujarat, United Kingdom
Employment Type
Full Time
predictive, and autonomous global IT operations model. This role is accountable for delivering next‐generation operational excellence by harnessing AI Ops, automation, and advanced observability to ensure resilient, high‐performing technology services across all regions. Working within an outsourced operations model, the Director provides strategic leadership to govern partners, optimise … operations strategy built on AI Ops/Automation first principles, shifting the organisation from reactive operations to predictive, autonomous IT operations. Drive full-stack observability across infrastructure, networks, cloud platforms, security, and applications using AI‐powered analytics and automated telemetry. Ensure 24/7 operational resilience by leveraging machine learning ...

Senior Software Engineer - Python/AWS

Hiring Organisation
Lunio
Location
London, UK
break down ambiguity, coordinate across Product/Design/Data to land outcomes. Raise the quality bar: define practical standards for testing, security, and observability, act as approver on critical PRs, model excellent reviews and pairing. Operate and improve production: own service performance targets for your area, lead incident response … simple. Technical Execution & Delivery: Leads execution across multiple stories/engineers, breaks down ambiguous problems, and delivers predictably with sensible trade-offs. Testing, Reliability & Observability: Bakes in testability, defines/uses service performance targets, alerts, logs, and traces, advocates for reliability alongside features. Security & Privacy: Applies secure-by-default patterns ...

Senior Software Engineer - Python/AWS

Hiring Organisation
Lunio
Location
City of London, London, United Kingdom
break down ambiguity, coordinate across Product/Design/Data to land outcomes. Raise the quality bar: define practical standards for testing, security, and observability, act as approver on critical PRs, model excellent reviews and pairing. Operate and improve production: own service performance targets for your area, lead incident response … simple. Technical Execution & Delivery: Leads execution across multiple stories/engineers, breaks down ambiguous problems, and delivers predictably with sensible trade-offs. Testing, Reliability & Observability: Bakes in testability, defines/uses service performance targets, alerts, logs, and traces, advocates for reliability alongside features. Security & Privacy: Applies secure-by-default patterns ...

Senior Software Engineer - Python/AWS

Hiring Organisation
Lunio
Location
City of London, Greater London, UK
break down ambiguity, coordinate across Product/Design/Data to land outcomes. Raise the quality bar: define practical standards for testing, security, and observability, act as approver on critical PRs, model excellent reviews and pairing. Operate and improve production: own service performance targets for your area, lead incident response … simple. Technical Execution & Delivery: Leads execution across multiple stories/engineers, breaks down ambiguous problems, and delivers predictably with sensible trade-offs. Testing, Reliability & Observability: Bakes in testability, defines/uses service performance targets, alerts, logs, and traces, advocates for reliability alongside features. Security & Privacy: Applies secure-by-default patterns ...

Head of Platforms - Technology, Infrastructure and Operations

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
code generation, testing, and automation. Drive adoption of AI‐enabled engineering practices. Ensure secure and efficient‐by‐default platform services through automation. Ensure reliability, observability, and cost efficiency of platform services. Define resilience, incident management, and operational models. Track and report on platform maturity and performance. Partner with Business Unit … developer experience. Demonstrated stakeholder influence across complex organizations. Experience leading distributed engineering teams. Familiarity with AI‐enabled engineering practices. Strong grounding in SRE, observability, and secure‐by‐design. Excellent communication and leadership skills. Success Measures Increased developer productivity and satisfaction. Adoption of platform capabilities across engineering teams. Reduction in toil ...

Staff Software Engineer, AI Reliability Engineering

Hiring Organisation
Jobleads-UK
Location
England, United Kingdom
Responsibilities Develop appropriate Service Level Objectives for large language model serving systems, balancing availability and latency with development velocity. Design and implement monitoring and observability systems across the token path. Assist in the design and implementation of high-availability serving infrastructure across multiple regions and cloud providers Lead incident response … more ML hardware accelerators (GPUs, TPUs, Trainium). Understand ML-specific networking optimizations like RDMA and InfiniBand. Have expertise in AI-specific observability tools and frameworks. Have experience with chaos engineering and systematic resilience testing. Have contributed to open-source infrastructure or ML tooling. The annual compensation range for this ...

Solutions Architect (New Relic/Snowflake)

Hiring Organisation
FDM Group
Location
London, South East, England, United Kingdom
Employment Type
Contractor
Contract Rate
£80,000 per annum, Negotiable, Pro-rata, Inc benefits
Hybrid role based in London. Our client, a major UK retailer, is undergoing a significant integration transformation — building out a centralised platform and observability capability as part of a wider initiative to modernise how integration works across the business. The successful candidate will sit within the IAA team and play … role in supporting the design and build-out of the platform, working closely with the observability team to shape the technical direction and underpin the delivery backlog. This is a hands-on architecture role that combines design ownership with close collaboration across engineering and delivery teams. Responsibilities Support the design ...

Dynatrace Expert

Hiring Organisation
Norton Blake
Location
Basildon, Essex, UK
Dynatrace Expert/Observability CoE Lead Location: Basildon, UK (Fully Onsite) Daily Rate: £350 – £360 per day (Maximum) Contract Duration: Initial contract until July 2027 Start Date: Immediate Agency: Norton Blake About the Role Norton Blake is partnering with a leading global technology consultancy to recruit a contract Dynatrace Expert. … This position is for a major, long-term initiative establishing a unified, EMEA-wide Observability Centre of Excellence (CoE) for a world-renowned consultancy. This unique contract role blends strategic leadership with hands-on technical execution. You will act as a core member of the newly formed CoE—defining enterprise ...

Dynatrace Expert

Hiring Organisation
Norton Blake
Location
Basildon, England, United Kingdom
Dynatrace Expert/Observability CoE Lead Location: Basildon, UK (Fully Onsite) Daily Rate: £350 – £360 per day (Maximum) Contract Duration: Initial contract until July 2027 Start Date: Immediate Agency: Norton Blake About the Role Norton Blake is partnering with a leading global technology consultancy to recruit a contract Dynatrace Expert. … This position is for a major, long-term initiative establishing a unified, EMEA-wide Observability Centre of Excellence (CoE) for a world-renowned consultancy. This unique contract role blends strategic leadership with hands-on technical execution. You will act as a core member of the newly formed CoE—defining enterprise ...

Network Reliability Specialist

Hiring Organisation
Ncounter
Location
East London, London, England, United Kingdom
Employment Type
Full-Time
Salary
£160,000 - £180,000 per annum
processes, and preventing incidents before they occur. Working across data centre, enterprise, and cloud environments, you will take ownership of the tooling, automation, and observability capabilities that allow the wider business to operate with confidence. Key Responsibilities • Build and enhance network observability, monitoring, and alerting frameworks across critical infrastructure • Develop … production networks where uptime and reliability are critical • Hands-on experience with network automation using Python, Ansible, or similar technologies • Strong knowledge of monitoring, observability, and alerting platforms • Experience building operational tooling, automation frameworks, or reliability-focused engineering solutions • Understanding of network security principles and secure infrastructure practices • Experience with ...

C++ Engineer (Real-Time Full Tick Re-platform)

Hiring Organisation
Hays
Location
London, United Kingdom
Employment Type
Contract
Your new role You'll step in as a Senior C++ Engineer, leading the design and rollout of observability across real-time, latency-sensitive platforms. This is a hands-on role where you'll shape how customer experience, system reliability, and operational insight are measured end-to-end. … engineering teams to embed metrics, tracing, logging, profiling, and telemetry into high-performance C++ systems. What you'll need to succeed Deep experience in observability engineering - metrics, tracing, logging, profiling, telemetry pipelines. Strong C++ engineering background in real-time or high-performance environments. Understanding of customer-experience measurement ...

Lead AI Architect

Hiring Organisation
MicroTECH Global Ltd
Location
Berkshire, England, United Kingdom
Employment Type
Full-Time
Salary
£119,000 - £120,000 per annum
neuro-symbolic AI concepts or hybrid reasoning architectures. Experience designing transparent, inspectable, or explainable AI methods. Practical experience with agentic reasoning evaluation, testing, benchmarking, observability, or failure analysis. Full-stack web development experience, including backend APIs and frontend application development. Technical Skills Strong Python engineering skills. Experience with modern … knowledge management, decision support, research automation, legal/financial/technical analysis, or complex operational workflows. Experience with production-grade AI system design, including observability, monitoring, testing, security, latency, cost control, and reliability. Familiarity with human-in-the-loop systems, provenance tracking, workflow auditability, or regulated environments. Experience integrating LLMs ...

Programme Director - Connectivity

Hiring Organisation
GSK
Location
Greater London, United Kingdom
Employment Type
Full Time
accountable for end‐to‐end programme delivery across multiple integrated workstreams (WAN/SD‐WAN, campus LAN/Wi‐Fi, cloud connectivity, security, DDI, observability and the managed services operating model). The role drives structured governance, disciplined execution and measurable benefits realisation, delivering change with minimal disruption across … dependency and change control) with clear decision points and escalation paths Drive cross‐workstream integration and sequencing across WAN, campus, security/segmentation, DDI, observability and MSP transition Own benefits tracking and value realisation, including cost reduction, performance/SLA improvements, resilience metrics and adoption of target standards Lead ...

VP, Marketing Operations and Agentic AI Systems

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
aligned to revenue targets and approved at exec level. Set the standard for what constitutes a production-grade agent at Sidetrade: evaluation suites, guardrails, observability, and retirement criteria. Advise the CMO and CEO on emerging AI capabilities and their commercial implications for pipeline and revenue. Lead delivery of production agents … quality parity or better. Own prompt libraries, tool definitions, and evaluation suites that govern agent quality across the team. Stand up and own the observability layer: cost per task, accuracy, drift, human escalation rate. Report the performance of the agentic layer monthly to the exec team and quarterly ...

VMware Architect

Hiring Organisation
Church International Ltd
Location
United Kingdom
Employment Type
Contract
Contract Rate
GBP Annual
VMware Aria Operations/VMware Cloud Foundation Architect to lead the design, optimisation, and strategic evolution of a large-scale enterprise monitoring and observability platform. The successful candidate will provide architectural leadership across monitoring, logging, reporting, automation, and capacity management functions, working closely with operational and technical teams to drive … time contract engagement commencing in mid-July. Key Responsibilities Act as the technical authority for VMware Aria Operations, Aria Operations for Logs, and associated observability platforms. Design and implement enterprise monitoring, logging, alerting, and reporting strategies. Translate high-level business and technical requirements into scalable architectural solutions. Define and build ...

Principal SRE Engineer / Grafana Specialist - (Outside IR35)

Hiring Organisation
17918
Location
Bristol, Gloucestershire, United Kingdom
Lead SRE/Observability Engineering Lead - (Outside IR35 Contract/Remote) Location: Bristol/London HQ - Largely Remote (Occasional Travel) Day Rate: Outside IR35 - £700 p/d Duration: 3-6 Months Initial - with intention to extend Payment Terms: Monthly Our client is a FTSE100 Wealth/Asset Management firm … seeking to engage a Lead SRE Engineer (Observability SME) to support the implementation and ins...

Director of Sales EMEA

Hiring Organisation
Primis
Location
United Kingdom
Director of Sales, EMEA (GTM Leadership) London Series C AI Observability Every enterprise AI team deploying LLMs and ML models at scale faces the same uncomfortable reality: once a model goes into production, it becomes a black box. Hallucinations, drift, bias, performance degradation - without the right tooling, you're flying … blind Our client is the platform that fixes that. An AI observability and LLM evaluation layer trusted by the world's most sophisticated ML teams - helping them monitor, debug, and improve AI systems in production. Think of it as the observability stack for the AI era, in the same ...

Director of Sales EMEA

Hiring Organisation
Primis
Location
United Kingdom, UK
Director of Sales, EMEA (GTM Leadership) London Series C AI Observability Every enterprise AI team deploying LLMs and ML models at scale faces the same uncomfortable reality: once a model goes into production, it becomes a black box. Hallucinations, drift, bias, performance degradation - without the right tooling, you're flying … blind Our client is the platform that fixes that. An AI observability and LLM evaluation layer trusted by the world's most sophisticated ML teams - helping them monitor, debug, and improve AI systems in production. Think of it as the observability stack for the AI era, in the same ...

Senior Software Engineer - Systematic Trading Infrastructure

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
execution systems and support live trading workflows Build tooling for real-time monitoring, diagnostics, performance attribution, and post-trade analysis Improve system design, scalability, observability, operational resilience, and maintainability Preferred Qualifications 5+ years of hands-on software engineering experience designing, building and operating systems with high complexity. Prior experience … data, research, back testing, execution, monitoring, and post-trade analysis. Strong system design skills and sound engineering judgment. High standards for correctness, reliability, testing, observability, and maintainability For more information about DRW's processing activities and our use of job applicants' data, please view our Privacy Notice at https:/ ...

Data Engineering Manager

Hiring Organisation
Skyscanner
Location
Greater London, United Kingdom
Employment Type
Full Time
search, social and programmatic. In other words, not just dashboards... but decisions. Along the way, you'll help evolve our data platform, improving scalability, observability and governance within a modern cloud environment. You'll partner across Marketing, Product, Analytics and Data Science to turn complex data into clear, actionable direction. … Partnering cross-functionally: You'll work closely with Marketing, Product, Analytics and Marketing Technology to shape and deliver the data roadmap. Improving data quality & observability: You'll champion reliable, trustworthy datasets with strong SLAs and clear monitoring. Balancing speed and sustainability: You'll navigate the trade-offs between rapid delivery ...