376 to 400 of 484 Observability Jobs in England

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
North East, Glasgow, UK
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

DevOps Engineer

Hiring Organisation
Adria Solutions
Location
Manchester, North West, United Kingdom
Employment Type
Contract, Work From Home
Contract Rate
£400 per day (Outside)
partial service) Google Cloud Platform (BigQuery for analytics) DevOps & CI/CD GitHub repositories & workflows Shared GitHub Actions pipelines Public and private repositories Monitoring & Observability Prometheus, Grafana, Alertmanager Logit.io StatusCake Azure Monitor Alerts Sentry (service-level monitoring) What We're Looking For Strong hands-on experience with Azure DevOps tooling … Kubernetes (AKS preferred) Experience with CI/CD pipelines (GitHub Actions) Familiarity with multi-cloud environments (AWS/GCP beneficial) Experience with monitoring and observability tools Ability to work in a collaborative, fast-paced environment Contract Details Start Date: ASAP Duration: Initial 3 months Location: Remote/Hybrid If you're ...

Senior Data Platform Engineer

Hiring Organisation
ed Resourcing Ltd
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 90,000 Annual
Snowflake platform components Building and maintaining Infrastructure as Code (Terraform) across environments Creating and optimising CI/CD pipelines (Azure DevOps, GitHub Actions) Implementing observability practices (logging, monitoring, alerting) Ensuring platform security, scalability, and performance Collaborating with architects and senior engineers on platform standards Mentoring engineers and promoting engineering best … experience with Terraform or similar IaC tooling Proven ability to build and manage CI/CD pipelines Solid understanding of cloud security and observability Scripting skills (PowerShell, Bash, Python) Strong communicator with experience working across teams Ideal Backgrounds Platform Engineers working in data environments DevOps/Platform Engineers with exposure ...

Azure Engineering Manager - Fully Remote

Hiring Organisation
GBV Ltd
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
engineers. What you'll be doing: Leading a distributed engineering team focused on platform reliability and scalability Driving SRE best practices (SLOs, automation, observability, incident management) Partnering with product, security, and engineering teams to shape infrastructure strategy Improving CI/CD, developer experience, and system performance Championing a culture of continuous … teams Deep Azure expertise (Terraform/IaC preferred) Background in software engineering (C#, Java, Python, or Ruby) Experience with Kubernetes, CI/CD, and observability tooling Passion for automation, reliability, and scalable systems Package highlights: Salary around £110-130k Private healthcare, pension, and strong benefits Clear progression and development ...

Platform Engineering Manager

Hiring Organisation
Prism Digital
Location
London Area, United Kingdom
cloud environments Architecture governance and design authority Security-by-design and Zero Trust Terraform or Bicep (production IaC) CI/CD and infrastructure automation Observability (SLOs, monitoring, incident management) Disaster recovery and resilience planning Vendor and third-party management Strong stakeholder communication What You’ll Work With Azure (landing zones … shared services) Terraform/Bicep CI/CD pipelines Kubernetes (AKS) Observability tooling (logs, metrics, tracing) Networking (VNets, ExpressRoute, private endpoints) Security controls and compliance frameworks Event Hubs, Service Bus, API Management Hybrid Windows/Linux infrastructure Nice to Haves FinOps (cost control, budgeting, optimisation) Financial services or regulated environments ...

Typescript Developer

Hiring Organisation
Get2Talent
Location
South East London, London, United Kingdom
Employment Type
Permanent
Salary
£85,000
performance trading platform. Work primarily with: TypeScript (Node.js & React) Monorepo tooling, GitHub, GitHub Actions Jest, Playwright Redis, MS SQL, WebSockets Docker, Kubernetes Observability tools (Grafana, Prometheus, SonarQube) Take end-to-end ownership of features from design to production. Collaborate closely with platform and DevOps engineers on build pipelines … observability, and operational concerns. Communicate directly with clients to clarify requirements and propose solutions. Contribute to and improve automated testing practices. Participate in peer code reviews and maintain high engineering standards. Leverage LLM/AI-enabled development tools as part of day-to-day development. Requirements 8+ years of professional ...

Lead React Developer

Hiring Organisation
Get2Talent
Location
South East London, London, United Kingdom
Employment Type
Permanent
Salary
£85,000
performance trading platform. Work primarily with: TypeScript (Node.js & React) Monorepo tooling, GitHub, GitHub Actions Jest, Playwright Redis, MS SQL, WebSockets Docker, Kubernetes Observability tools (Grafana, Prometheus, SonarQube) Take end-to-end ownership of features from design to production. Collaborate closely with platform and DevOps engineers on build pipelines … observability, and operational concerns. Communicate directly with clients to clarify requirements and propose solutions. Contribute to and improve automated testing practices. Participate in peer code reviews and maintain high engineering standards. Leverage LLM/AI-enabled development tools as part of day-to-day development. Requirements 8+ years of professional ...

Full Stack Typescript Engineer

Hiring Organisation
Get2Talent
Location
Oxford, Oxfordshire, South East, United Kingdom
Employment Type
Permanent
Salary
£70,000
performance trading platform. Work primarily with: TypeScript (Node.js & React) Monorepo tooling, GitHub, GitHub Actions Jest, Playwright Redis, MS SQL, WebSockets Docker, Kubernetes Observability tools (Grafana, Prometheus, SonarQube) Take end-to-end ownership of features from design to production. Collaborate closely with platform and DevOps engineers on build pipelines … observability, and operational concerns. Communicate directly with clients to clarify requirements and propose solutions. Contribute to and improve automated testing practices. Participate in peer code reviews and maintain high engineering standards. Leverage LLM/AI-enabled development tools as part of day-to-day development. Requirements 8+ years of professional ...

Senior Lead Engineer

Hiring Organisation
Investigo
Location
City of London, London, United Kingdom
change management tools like Liquibase into automated pipelines Apply DevSecOps best practices across the lifecycle: static analysis, dependency scanning, and secure credential management Ensure observability, monitoring, and performance using GCP Operations Suite or New Relic Mentor engineers and collaborate across global, distributed teams What We’re Looking For Proven experience … expertise: BigQuery, Dataproc, Cloud Composer Deep data architecture and engineering knowledge: Spark, DBT, Oracle, BigQuery Experience designing scalable architectures (Microservices, Monoliths, Batch) Skilled in observability, monitoring, and DevSecOps integration Excellent communication with a record of collaborating globally Why You’ll Love It Combine architecture, coding, and leadership in one role ...

Senior / Lead Data Engineer (AI-Focused)

Hiring Organisation
PaymentGenes
Location
City of London, London, United Kingdom
inference (batch and real-time) Evaluate and integrate emerging AI tooling where strategically valuable 🔧 Technical Leadership Set best practices for testing, documentation, lineage, and observability Lead code reviews and mentor data & analytics engineers Drive CI/CD and infrastructure-as-code adoption Own platform reliability, performance optimisation, and cost efficiency … Infrastructure Feature engineering architecture ML pipeline and deployment workflows Experience supporting production ML systems Familiarity with embeddings, vector databases, LLM orchestration (desirable) Data observability and model monitoring Platform & DevOps CI/CD for data workflows Git-based engineering standards Docker/containerisation Infrastructure-as-code (e.g., Terraform) Monitoring and alerting ...

Software Engineer

Hiring Organisation
Hydrogen Group
Location
City of London, London, United Kingdom
Code Scaling and managing large fleets of IoT devices in the field Developing CI/CD pipelines and automation across the stack Implementing observability, monitoring, and telemetry (cloud + edge) Supporting security and compliance standards (e.g. SOC2, HIPAA) Improving developer workflows and engineering productivity What We Are Looking For 5+ … Docker & Kubernetes (EKS preferred) Proficiency in Python, Go, or another modern language Experience building CI/CD pipelines and automation Hands-on experience with observability tools (Grafana, Prometheus) Nice to Have Experience with IoT/edge infrastructure (device provisioning, OTA updates) Hybrid or multi-cloud environments SOC2 compliance exposure High ...

Cloud Engineer

Hiring Organisation
Spectrum It Recruitment Limited
Location
Southampton, Hampshire, South East, United Kingdom
Employment Type
Permanent
Salary
£65,000
secure, resilient cloud infrastructure across AWS and Azure. You'll play a key role in modernising platforms, migrating legacy services, and improving automation, observability and security across a multi-cloud estate. Cloud Engineer (AWS & Azure) Hybrid (2 days per month onsite) Location: Southampton What you'll be doing Designing … similar), Azure DevOps, PowerShell, Azure CLI Scripting: PowerShell, Python, Bash Containers: Docker, container registries (e.g., ACR) CI/CD: Azure DevOps Pipelines, YAML automation Observability: Datadog, Grafana Cloud, OpenTelemetry, CloudWatch, Prometheus, Loki Benefits (from day one) Up to 15% Bonus scheme 25 days annual leave + bank holidays Pension ...

Staff Engineer

Hiring Organisation
Xapien
Location
London Area, United Kingdom
workflows and MongoDB running on Kubernetes in GCP. You'll work with modern patterns including event-driven architectures, gRPC and REST APIs, and comprehensive observability with Grafana Cloud. We're an AI-native engineering team. We use Claude Code daily and we're investing heavily in AI-assisted development … team — establishing shared conventions, measuring impact, and helping engineers level up Background in SaaS platforms or B2B products at scale Expert-level knowledge of observability tools (Grafana, Prometheus, etc.) Deep understanding of authorization patterns, security, and multi-tenancy Experience with protobuf and gRPC Our Tech Stack Languages: Go Databases: MongoDB ...

Infrastructure Engineer - GCP

Hiring Organisation
Blupace Tech
Location
London Area, United Kingdom
automate infrastructure provisioning and configuration Implement automation frameworks and tooling to streamline deployments, scaling, and operational workflows Integrate and optimise monitoring, logging, and observability solutions across GCP environments Support FinOps activities including cost monitoring, usage optimisation, tagging governance, and reporting Ensure cloud environments meet security, compliance, and performance standards Troubleshoot … proficiency in Terraform and Infrastructure‐as‐Code methodologies Strong automation skills using Python, Bash, Ansible, or CI/CD pipelines Experience with monitoring and observability tools (Stackdriver, Prometheus, Grafana, Datadog, ELK, etc.) Solid understanding of FinOps principles and cloud cost‐optimisation strategies Strong knowledge of cloud networking, security, identity ...

Senior Python Engineer (£100k + benefits)

Hiring Organisation
Morson Edge
Location
Manchester, North West, United Kingdom
Employment Type
Permanent, Work From Home
Great opportunity for Senior Python Engineers to work remotely for a UK based AI scale-up. You'd join a large engineering department and would work within a cross functional product-based team responsible for ...

Technical Architect Principal (UK)

Hiring Organisation
Stackstudio Digital Ltd
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
GBP 500 - 550 Daily
days/week Contract Duration - 6 months The Role The Technical Architect Principal will lead the architecture, design, and technical governance of an enterprise observability and telemetry platform. This role is responsible for designing major solution components, defining reference architectures, and guiding development teams throu ...

Hybrid Domain Consolidation Analyst | IT Infrastructure

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
leading observability solutions provider in Greater London is seeking a Domain Consolidation Analyst for a 6-month full-time contract with hybrid work. The role involves managing a project to consolidate IT domains, coordinating with third parties, and ensuring compliance with ISO 27001 standards. Candidates must have at least ...

Partner Manager

Hiring Organisation
Timebeat
Location
London Area, United Kingdom
written communication and CRM discipline Nice to have Experience with channel models (reseller, referral, MSP, SI), co-sell motions, or marketplace partnerships Familiarity with observability/monitoring, networking, infrastructure tooling, or developer-facing products Experience building partner programs from scratch (tiering, enablement, certification, MDF) Success metrics (examples) Number ...

GCP Devops Lead

Hiring Organisation
Infoplus Technologies UK Ltd
Location
Bristol, Somerset, United Kingdom
Employment Type
Permanent
Salary
GBP Annual
Actions, Harness, Jenkins). Networking & Security: Experience with GCP Cloud Armor, GCP Networking, and embedding secure-by-design controls from design to runtime. Automation & Observability: Implementing actionable observability, performance tuning, and automation to reduce toil. Defining and operating against SLOs/SLIs. Scripting & Tooling: Scripting in Bash, PowerShell, or Python. … Performance & Reliability: Define, monitor, and operate against service level objectives (SLOs/SLIs), ensuring high availability, performance, and fault tolerance. Continuous Improvement: Drive automation, observability, and performance tuning to reduce manual effort and improve platform reliability. Collaboration: Work closely with architecture and feature teams to evolve the cloud roadmap ...

Site Reliability Engineer

Hiring Organisation
McGregor Boyall Associates Limited
Location
Leeds, West Yorkshire, Yorkshire, United Kingdom
Employment Type
Permanent
Salary
£80,000
implementation across the GCP/Azure platforms. They are looking for several Site Reliability Engineers (SREs) to help improve the reliability, performance and observability of our Azure and GCP environments. You'll work within a multidisciplinary engineering squad, supporting the delivery, operation and continuous improvement of our cloud-hosted services. … Support the reliability and performance of the cloud platforms your squad owns. Use observability tools, metrics, logs and traces to detect and prevent issues. Contribute to incident response, post-incident reviews and problem management activities. Build automation that removes toil and improves operational efficiency. Work collaboratively with engineers, Product Owners ...

Lead Platform Engineer

Hiring Organisation
Revybe IT Recruitment Ltd
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£100,000 - £110,000 per annum
deeply hands-on with modern infrastructure tooling. The company builds all its software in-house and has been investing heavily in its platform, observability, and cloud capabilities as they continue to scale. The Opportunity: You’ll join as the Lead Platform Engineer, working closely with engineering leadership to drive … currently operates in a hybrid environment: ~60% on-premise infrastructure ~40% Microsoft Azure The long-term strategy is focused on modernising the platform, improving observability, and evolving cloud capabilities, making this an excellent opportunity for someone who enjoys building and shaping systems. Tech Stack: You’ll be working across ...

Site Reliability Engineering Lead – Financial Services

Hiring Organisation
Alexander Ash Consulting
Location
London Area, United Kingdom
operations, and improvement of the SRE platforms, teams, and organisation. You will be responsible for leading and scaling the SRE function, driving intelligent automation, observability, and resilience across the organisation, and leading on production incidents, from frameworks to resolution. You will work in a hybrid on-premise/AWS-based … related fields (platform engineering, DevOps etc.) Deep technical experience in cloud-native AWS and on-premise systems architecture Strong incident management and observability experience for large scale systems Intelligent automation/Agentic AI experience preferred Excellent AWS services, data platforms, software engineering, CI/CD, IaC experience Degree educated ...

AWS Site Reliability Engineer (Data Platform)

Hiring Organisation
FBI &TMT
Location
City of London, London, United Kingdom
Employment Type
Permanent
Salary
£450 - £455 per day
cloud-native data platform built on AWS, Snowflake, and Databricks. This role focuses on enhancing reliability through automation, disaster recovery testing, resiliency engineering, observability, and proactive SLO/SLI/SLA management. Key Responsibilities: Design, build, and maintain automation for infrastructure provisioning, platform operations, and incident response using … manage SLIs, SLOs, and SLAs for critical data pipelines and platform services; utilise error budgets to guide reliability improvements. Build and operate robust observability solutions (metrics, logs, traces, alerts) for AWS services, Snowflake, and Databricks workloads. Partner with data engineering and platform teams to embed reliability-by-design into architecture ...

AI Engineer – Production LLM Systems

Hiring Organisation
Redimeer
Location
London Area, United Kingdom
orchestration. You will work on: Multi‐agent architectures Intelligent tool and API integrations RAG pipelines and vector‐based retrieval Evaluation frameworks and AI observability Production workflows that ensure reliability, consistency, and scale You’ll play a critical role in crafting the orchestration layer that makes LLM systems trustworthy, handling … improving robustness across diverse use cases. Key Responsibilities Build production AI systems using LLMs, RAG pipelines, vector databases, and agentic frameworks Design evaluation and observability frameworks to measure performance, accuracy, and reliability Develop clean, scalable applications with proper error handling, APIs, and data pipelines Implement and maintain retrieval systems (vector ...

Principal Developer Team Lead

Hiring Organisation
Cambridge University Press & Assessment
Location
Cambridge, Cambridgeshire, United Kingdom
Employment Type
Permanent
Salary
GBP 51,400 - 68,800 Annual
legacy applications to cloud-native AWS architectures Build DevOps automation to support SRE practices Establish AI/ML development standards and frameworks Set observability, monitoring, and incident response standards Promote best practices in web, event-driven, and cloud-native technologies Provide technical expertise and oversee code reviews People Leadership Manage … more modern programming languages Experience with AWS cloud and infrastructure DevOps skills: automation, CI/CD, infrastructure-as-code Understanding of SRE and observability Experience in web-apps and modern frameworks Strong communicator with technical and non-technical audiences Technical Expertise CI/CD pipelines, automation frameworks, and developer tooling ...