Observability Jobs in the UK

AI Architect

City of London, London, United Kingdom

Tata Consultancy Services

If you need support in completing the application or if you require a different format of this document, please get in touch at UKI.recruitment@tcs.com or call the TCS London Office on 02031552100/+44 204 520 2575 with the …

Posted: 4 days ago

AI Architect

London Area, United Kingdom

Tata Consultancy Services

Posted: 4 days ago

AppSec Lead

South East, United Kingdom

Halian Technology Limited

A leading fintech company is seeking a Lead AppSec Engineer to join their established team. You'll be instrumental in embedding security into every stage of the software development lifecycle: guiding engineers, shaping best practices, and driving secure, scalable solutions across our platform. Key …

Employment Type: Permanent

Salary: GBP Annual

Posted: 6 days ago

Senior Software Engineer

United Kingdom
Hybrid/Remote Options

Bezos

At Bezos, our vision is to Deliver Happiness: for our team, for the end consumers, for the e-commerce sellers, and for our logistics partners. Exciting times in e-commerce: E-commerce sales are driven by consumers who increasingly buy …

Employment Type: Permanent

Salary: GBP Annual

Posted: 18 days ago

Senior Agentic AI Engineer

Birmingham, England, United Kingdom

Method Resourcing

act? This is a chance to design and deliver agentic AI systems on Azure that automate real business workflows through tool use, retrieval, and reasoning, with the reliability and observability of true production engineering. In this position you’ll take ownership of designing and scaling end-to-end agentic solutions on Azure, combining LLMs, APIs, and orchestration frameworks to deliver … Productionise on Azure using AI Foundry/OpenAI, Azure ML, Functions, Event Grid/Service Bus, and Kubernetes. Build LLMOps pipelines for evaluation, monitoring, safety, and cost control. Define observability standards across prompts, tools, and data flows. Establish governance patterns, safety, privacy, and auditability. Stay hands-on with critical code paths while guiding architecture and best practice. 🧠 Required Skills/…

Posted: Yesterday

Azure AI Engineer

United Kingdom
Hybrid/Remote Options

Cognitive Group | Part of the Focus Cloud Group

Functions, Logic Apps, and APIs to orchestrate data and AI workflows. Design and deliver retrieval-augmented generation (RAG) and Copilot-style assistants embedded into business and web applications. Embed observability and monitoring into AI and data pipelines, tracking performance, quality, and cost. Collaborate with data scientists, architects, and product teams to turn prototypes into enterprise-ready AI services. Stay … or equivalent for building and extending web or Power Apps solutions. Knowledge of Azure DevOps, CI/CD, and Infrastructure-as-Code (Bicep, Terraform). Deep appreciation of governance, observability, and secure design principles. …

Posted: 2 days ago

Linux Production Engineer

London Area, United Kingdom

Autonomai Recruitment

experience building technology 0→1, owning systems end-to-end, and working close to the metal. They will operate across everything from bare-metal Linux to modern build and observability stacks. Linux Platform Engineer – Trading Infrastructure. Overview: The firm is seeking a Linux Platform Engineer to join a small, high-impact engineering group supporting ML/AI-driven trading. … latency. Contribute to kernel-level debugging and system improvements. Automate Linux fleet builds, creating consistent, reproducible systems. Manage Kubernetes cluster infrastructure, networking, and container orchestration. Enhance observability. Analyze and optimize networking across the full TCP/IP stack. Investigate core dumps, memory bottlenecks, and CPU performance issues across distributed systems. Develop Python tooling for internal automation …

Posted: 4 days ago

Linux Production Engineer

City of London, London, United Kingdom

Autonomai Recruitment

Posted: 4 days ago

Head of Platform Engineering

Surrey, England, United Kingdom
Hybrid/Remote Options

La Fosse

scale. This is a pivotal, visible role reporting directly to the CTO. The Opportunity: You’ll shape the operational strategy and modernise how the platform is managed, driving reliability, observability, automation, and cost efficiency. You’ll manage the Head of DevOps and work closely with Product, Engineering and Finance to ensure the platform is secure, resilient, scalable and commercially efficient. … IT, Security and Platform Operations; Reliability, performance and availability of a cloud-native SaaS platform (AWS/serverless); Cost-to-Serve ownership and cloud cost visibility/optimisation; Maturing observability, incident management & operational governance; Uplifting DevOps engineering practices and platform automation, use of AI; Vendor & outsourced IT partner management; Supporting a high-change organisation scaling for enterprise success. What You …

Posted: 4 days ago

Director - Performance and Reliability

Cambridgeshire, England, United Kingdom

Sanderson

lead performance testing and chaos engineering initiatives, and embed reliability best practices across engineering, DevOps, and infrastructure teams. This is a senior, strategic leadership role focused on system excellence, observability, and continuous improvement. Ideal Candidate: Proven experience leading Performance Engineering, Reliability, or SRE functions; deep expertise in performance testing methodologies (load, stress, spike, soak); strong hands-on background with LoadRunner … strategy across critical platforms and services. Oversee load, stress, and chaos testing initiatives to ensure systems perform and recover under real-world conditions. Define and drive best practices for observability, monitoring, and APM adoption using tools like Dynatrace. Drive incident reduction, faster recovery (MTTR), and continuous reliability improvements. Champion a culture of performance ownership, ensuring teams build with scalability, stability …

Employment Type: Full-Time

Salary: £84,000 - £95,000 per annum, Negotiable, Inc benefits

Posted: 27 days ago

Director - Performance and Reliability

Cambridgeshire, East Anglia, United Kingdom

Sanderson Recruitment

Employment Type: Permanent

Salary: £95,000

Posted: 27 days ago

Software Developer

Hammersmith, England, United Kingdom

OpenSource

data sources using advanced web scraping and reverse-engineering techniques. Developing and maintaining low-latency, real-time data feeds to support internal systems and strategies. Improving internal visibility and observability tooling to help diagnose integration issues and identify improvements. Contributing across the full lifecycle of your work: design, development, testing, review, deployment, and ongoing support. Working within an agile, flexible … a rotational basis. Tech Stack: Languages: Python (3.10+), plus TypeScript/JavaScript for frontend work, and occasional Go for infrastructure tasks; Messaging: RabbitMQ, Kafka; Storage: PostgreSQL, Redis; Environment: Linux; Observability: OpenTelemetry, Prometheus, Grafana, Zabbix. Requirements: Must-haves: Strong software development experience, especially with Python. A degree in Computer Science or another numerical discipline. Clear communication skills, able to explain technical …

Posted: 4 days ago

Software Developer

Alderley Edge, Cheshire, United Kingdom

Transunion

Day You'll Be: Design and build reliable backend systems and infrastructure tooling. Use TDD to write high-quality, maintainable code and build out automated test suites. Own reliability, observability, and performance of key services. Collaborate with clients to understand requirements, debug issues, and propose solutions. Drive improvements to system architecture, automation, and deployment processes. Mentor junior developers and contribute … in writing and on calls. Desirable Skills & Experience: Experience owning backend systems in production environments; experience with cloud platforms (AWS or GCP); infrastructure-as-code, CI/CD, and observability tooling; experience scaling systems under sustained load; contributions to internal tooling or open source; experience with large datasets and machine learning models. Impact You'll Make: What's In It …

Employment Type: Permanent

Posted: 3 days ago

Senior DevSecOps engineer

England, United Kingdom
Hybrid/Remote Options

Seccl Technology Limited

handling, JWK publishing, and SSO connection setup. Utilising Infrastructure as Code (Terraform) and CI/CD (GitHub Actions) to manage Auth0 configuration and ensure safe, repeatable deployments. Implementing comprehensive observability for authentication paths with structured logs, monitoring dashboards, alerts, and SLOs. Collaborating closely with product, engineering, and support teams on migration timelines, communications, and incident response. This role's for … and identity configurations, including secure secrets management. Solid understanding of core AWS services relevant to modern authentication patterns, such as API Gateway, Lambda authorisers, and CloudWatch. A commitment to observability, with hands-on experience implementing structured logging, dashboards, and SLOs for critical services. Excellent collaboration skills, demonstrated through participation in design reviews, pairing, and writing clear technical documentation (e.g., runbooks …

Employment Type: Permanent

Salary: GBP Annual

Posted: 16 days ago

Senior DevSecOps engineer

Edinburgh, Midlothian, United Kingdom
Hybrid/Remote Options

Seccl Technology Limited

Employment Type: Permanent

Salary: GBP Annual

Posted: 16 days ago

Senior DevSecOps engineer

Bath, Somerset, United Kingdom
Hybrid/Remote Options

Seccl Technology Limited

Employment Type: Permanent

Salary: GBP Annual

Posted: 16 days ago

Azure Consultant - Presales

United Kingdom
Hybrid/Remote Options

Hancock & Parsons Ltd

A well-established software company are seeking an Azure Consultant to join their team! This is a hybrid technical and pre-sales role that involves a mix of customer engagement and being hands-on with Azure. If you're someone …

Posted: 2 days ago

Head of Cloud – Contract (Outside IR35)

Derby, England, United Kingdom
Hybrid/Remote Options

Experis UK

Head of Cloud – Contract (Outside IR35). Location: Hybrid (East Midlands/London 1-2 days/week onsite). Rate: Up to £700/day. Contract Type: Outside IR35. Duration: 3-6 months (initial), with potential extension. Start Date: ASAP. About …

Posted: 3 days ago

Staff Data Engineer

London, United Kingdom
Hybrid/Remote Options

Fruition Group

Job Title: Staff Data Engineer. Location: London, Hybrid. Salary: c.£140,000 + bonus + share options. Why Apply? This is a unique opportunity to take a leading role in shaping the data strategy of a fast-growing Insurtech scale …

Employment Type: Permanent

Posted: 30+ days ago

Staff Site Reliability Engineer - Observability

London Area, United Kingdom
Hybrid/Remote Options

Motive Group

Senior/Staff Site Reliability Engineer - Observability | London (Hybrid) If you care deeply about building and operating world-class infrastructure for AI at scale, this one’s worth your time. We’re working with a company that builds the backbone powering some of the most demanding AI workloads on the planet. Think large-scale GPU clusters, global telemetry systems, and … distributed training environments used by leading research and enterprise teams. They’re looking for a Senior or Staff SRE with deep experience in observability at massive scale - someone who’s tuned Prometheus/Mimir, Loki, or Tempo clusters beyond 100M+ series or 10TB/day logs, and who thrives in highly technical, fast-moving environments. You’ll be working on … Designing and scaling observability for globally distributed GPU infrastructure. Building automation that cuts operational toil and improves reliability. Partnering with platform and infrastructure teams to deliver true visibility across complex AI systems. If you’ve built or operated telemetry stacks for large-scale, GPU-heavy, or multi-tenant environments - and want to work on cutting-edge problems in a business …

Posted: 2 days ago

Staff Site Reliability Engineer - Observability

City of London, London, United Kingdom
Hybrid/Remote Options

Motive Group

Posted: 2 days ago

Platform Engineer

England, United Kingdom
Hybrid/Remote Options

Harnham

AND RESPONSIBILITIES Support and enhance the company's infrastructure and production systems across GCP. Contribute to a major replatforming project to GKE, ensuring scalability, automation, and security. Improve reliability, observability, and CI/CD pipelines (GitLab CI, Argo, Flux). Work closely with developers to embed best practices and elevate the internal developer experience (IDP). Collaborate within a small … with GCP (Google Cloud Platform). Deep understanding of CI/CD pipelines - GitLab CI, Argo, or Flux. Experience with HashiCorp Vault and open-source tooling. Background in automation, observability, and platform reliability. Excellent problem-solving skills and a collaborative, pragmatic mindset. THE DETAILS: Day rate: £550-£650/day (Outside IR35); Contract: 3 months, with scope for extension …

Posted: Today