401 to 425 of 496 Observability Jobs in England

Cloud SRE - Global Observability Lead (Remote UK)

Hiring Organisation
Jobleads-UK
Location
Newcastle upon Tyne, England, United Kingdom
leading technology company is seeking a Staff Site Reliability Engineer - Cloud to architect the Observability Centre of Excellence, ensuring reliability and uptime of global platforms. This role involves implementing OpenTelemetry, developing automation scripts, and optimizing platform performance while collaborating with engineering teams. Required skills include experience with observability tools like ...

Senior SRE & Observability Engineer – Trade Tech

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Bloomberg L.P. is seeking a Senior Software Engineer/SRE for the TRAX Observability team in London. This role involves enhancing systems for performance metrics, improving telemetry reliability, and collaborating with various teams across global offices. Candidates should have experience with high-level programming languages, Unix/Linux basics … observability concepts like distributed tracing and logging. Strong communication skills are essential. The position emphasizes technical growth, stakeholder influence, and a commitment to diversity and inclusion within the workplace. ...

Senior Software Engineer, Chem-Bio

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
The AI Security Institute is the world's largest and best-funded team dedicated to understanding advanced AI risks and translating that knowledge into action. We’re in the heart of the UK government with ...

Data Architect

Hiring Organisation
Jobleads-UK
Location
Bristol, England, United Kingdom
We believe in the power of ingenuity to build a positive human future. We challenge where it matters and own the outcome. As strategies, technologies, and innovation collide, we create opportunity from complexity. Our teams of interdisciplinary ...

Data Architect

Hiring Organisation
Jobleads-UK
Location
Manchester, England, United Kingdom
We believe in the power of ingenuity to build a positive human future. We challenge where it matters and own the outcome. As strategies, technologies, and innovation collide, we create opportunity from complexity. Our teams of interdisciplinary ...

Database Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Southampton, England, United Kingdom
tune performance across hundreds of instances. Architect Cross‐Cloud Portability: use CNPG and cloud‐native patterns to keep our database layer provider‐agnostic. Evolve Observability & Monitoring: build proactive monitoring and alerting to detect regressions before they affect customers. Support Replication & Mobility: enable data streaming and zero‐downtime migration strategies … provision infrastructure, avoiding manual implementations. Distributed Systems enthusiast: enjoy the challenge of multi‐tenant, multi‐region, multi‐cloud scenarios with rigorous data integrity. Security & Observability mindset: build deep observability (Prometheus/Grafana/OpenTelemetry/Humio) and guardrails for secure operation. Engineering via code: deliver backend services in Java with ...

Database Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
tune performance across hundreds of instances. Architect Cross‐Cloud Portability: use CNPG and cloud‐native patterns to keep our database layer provider‐agnostic. Evolve Observability & Monitoring: build proactive monitoring and alerting to detect regressions before they affect customers. Support Replication & Mobility: enable data streaming and zero‐downtime migration strategies … provision infrastructure, avoiding manual implementations. Distributed Systems enthusiast: enjoy the challenge of multi‐tenant, multi‐region, multi‐cloud scenarios with rigorous data integrity. Security & Observability mindset: build deep observability (Prometheus/Grafana/OpenTelemetry/Humio) and guardrails for secure operation. Engineering via code: deliver backend services in Java with ...

Principal Machine Learning Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
data engineering teams to implement scalable data lakehouse oriented feature architectures and enterprise‐grade ML governance. Champion engineering standards for model quality, documentation, observability, and platform resilience. Feature Engineering & Data Architecture Architect highly scalable, production‐ready feature pipelines within Lakehouse environments. Set the technical direction for fallback and resilience strategies … including scoring metrics, latency, error analytics, and SLOs. Partner with platform teams to optimise cost, scale, and reliability of inference endpoints. Monitoring, Drift Detection & Observability Define observability standards for feature drift, concept drift, performance degradation, and data integrity. Lead the creation of dashboards, benchmarks, and automated alerting across ...

Manager – Site Reliability Engineering

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
minimise business disruption and improve operational efficiency. Risk Management & Compliance: Ensure compliance with regulatory standards and internal governance. Proactively identify and mitigate operational risks. Metrics & Observability: Establish and maintain robust observability practices, employing metrics, logging, and tracing to drive data-driven decisions and improve system health. Out of hours support/… toil, and eliminating manual operational tasks. Excellent communication and stakeholder management skills, particularly under pressure. Expertise in automation (Python, Shell, PowerShell etc.). Familiarity with observability tools and practices (metrics, logging, tracing). Ability to lead capacity planning and scalability strategies to support growth. Knowledge of clearing and settlement processes ...

Principal AI Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
experimenting with cutting‐edge technologies. Preferred Requirements Advanced Integration - Experience integrating Salesforce with external agents via APIs and open standards (MCP, A2A). Governance & Observability - Familiarity with prompt governance, observability, monitoring frameworks, responsible AI and compliance best practices. Cross‐Platform Background - Background in cross‐platform integrations (e.g., Hyperscaler SDKs ...

Principal AI Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
experimenting with cutting‐edge technologies. Preferred Qualifications Advanced Integration – Experience integrating Salesforce with external agents via APIs and open standards (MCP, A2A). Governance & Observability – Familiarity with prompt governance, observability, monitoring frameworks, responsible AI and compliance best practices. Cross‐Platform Background – Background in cross‐platform integrations (e.g., Hyperscaler SDKs ...

Senior SRE: AI-Driven Observability & Automation

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
leading consulting firm in London is looking for an experienced Site Reliability Engineer to enhance IT operations through observability and automation. Key responsibilities include architecting observability platforms, implementing SRE best practices, and driving AI-based automation initiatives. Ideal candidates will have over 12 years of experience in IT operations, strong ...

SRE - Contract

Hiring Organisation
Pixelcode Technologies Limited
Location
Hampshire, South East, United Kingdom
Employment Type
Contract
contract basis (INSIDE IR35) with strong expertise in Dynatrace implementation. The ideal candidate should have hands-on experience designing and deploying observability solutions across complex enterprise environments, with deep expertise in Dynatrace architecture, integrations, alerting, dashboarding, and troubleshooting distributed systems. Key Requirements 8+ years of IT experience with strong … expertise in cloud and observability solutions. Expert-level experience designing, deploying, and configuring Dynatrace in complex environments. Hands-on experience with Dynatrace integrations, alerting, dashboard creation, synthetic monitoring, and distributed tracing. Strong experience implementing enterprise-scale monitoring and observability solutions. Deep expertise in AWS services including ...

Senior Software Engineer (Node.js / TypeScript / AWS)

Hiring Organisation
Adria Solutions
Location
Manchester, North West, United Kingdom
Employment Type
Permanent
Salary
£80,000
build scalable backend services and cloud infrastructure Architect event-driven and distributed systems on AWS Develop APIs, microservices and internal tooling Improve reliability, observability and developer workflows Conduct load testing and performance optimisation Contribute to frontend applications where required About You You are a senior engineer with deep backend … driven architectures and high-concurrency systems Infrastructure as Code experience (Pulumi, Terraform or similar) Strong understanding of databases, caching and performance optimisation Experience with observability, monitoring and alerting Comfortable working across the stack when required Strong Linux, Docker and Git knowledge Not the Right Fit If Your experience is primarily ...

Senior Software Engineer for Real-Time Platforms (Node.js / TypeScript / AWS). Job in Mancheste[...]

Hiring Organisation
Jobleads-UK
Location
Manchester, England, United Kingdom
build scalable backend services and cloud infrastructure Architect event‐driven and distributed systems on AWS Develop APIs, microservices and internal tooling Improve reliability, observability and developer workflows Conduct load testing and performance optimisation Contribute to frontend applications where required About You You are a senior engineer with deep backend … driven architectures and high‐concurrency systems Infrastructure as Code experience (Pulumi, Terraform or similar) Strong understanding of databases, caching and performance optimisation Experience with observability, monitoring and alerting Comfortable working across the stack when required Strong Linux, Docker and Git knowledge Not the Right Fit Your experience is primarily frontend ...

IT Service Performance & Reliability Manager

Hiring Organisation
Spectrum It Recruitment Limited
Location
New Milton, Hampshire, South East, United Kingdom
Employment Type
Permanent
Salary
£60,000
across critical IT services. This role focuses on keeping customer-facing services fast, reliable, and fully observable, while driving continuous improvement. You will lead observability across services, ensuring effective monitoring and actionable insights. You'll manage capacity and performance through forecasting and trend analysis, identifying risks early and driving improvements. … performance in IT environments Hands-on experience with AWS and Azure Strong knowledge of ITIL v3/v4 (certification required) Experience with monitoring/observability tools (e.g. Zabbix, Grafana, Kibana, OpenSearch) Knowledge of Windows and Linux server environments Scripting skills (e.g. Python, PowerShell, Node.js) Experience integrating data via APIs, webhooks ...

Director - Principal Engineer (Java/Angular/AI)

Hiring Organisation
Robert Walters
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£140,000 - £170,000 per annum
volumes of financial and transactional data Contribute directly to architecture, system design, and hands-on software development Drive engineering best practices across automation, testing, observability, and performance Build resilient, production-grade systems with a strong focus on reliability and scalability Work across the full software development lifecycle from design through … scalability, and high-availability systems Experience building automated, production-grade platforms with minimal manual intervention Familiarity with cloud-native technologies, CI/CD, and observability tooling Strong engineering mindset with a hands-on approach to development Interest in modern engineering tooling, including AI-assisted development workflows Robert Walters Operations Limited ...

Head of Infrastructure

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
cloud architecture, operational resilience, developer experience and infrastructure team leadership. You will be responsible for shaping the long term infrastructure roadmap, improving reliability and observability, strengthening incident response and ensuring the platform can support a growing customer base and increasingly critical product suite. This is a role for someone … platform strategy Design and evolve the AWS cloud architecture to support scale, resilience and performance Set standards across infrastructure, CI/CD, environments and observability Lead production reliability, uptime, incident response and post incident reviews Improve monitoring, alerting and on call practices to ensure they are effective and sustainable Partner ...

DevOps Technical Lead

Hiring Organisation
Data Careers
Location
South East London, London, United Kingdom
Employment Type
Permanent, Work From Home
optimise CI/CD pipelines Improve deployment reliability and reduce rollback frequency Standardise release processes across engineering teams Implement progressive delivery practices Reliability & Observability Define and track SLIs/SLOs Enhance monitoring, alerting and incident response processes Lead post-incident reviews and root cause analysis Drive reduction of operational toil … Lambda) Proven Infrastructure-as-Code experience (Terraform preferred) CI/CD tooling experience (GitHub Actions, GitLab CI, Jenkins) Experience operating production SaaS environments Strong observability tooling knowledge (Datadog, Prometheus, ELK etc.) Incident management and root cause analysis experience Experience in regulated or security-conscious environments is highly desirable ...

Platform Storage Engineer

Hiring Organisation
Ncounter
Location
East London, London, England, United Kingdom
Employment Type
Full-Time
Salary
£160,000 - £190,000 per annum
vendor storage tooling into a unified platform • Improve storage throughput, data locality and platform efficiency for research workloads • Collaborate closely with compute, networking and observability teams across the wider platform estate • Support troubleshooting, tuning and reliability engineering for production storage systems What we’re looking for: • Strong backend or systems … Rust, C++ or Java • Experience building or supporting distributed systems at scale • Strong Linux knowledge and an interest in infrastructure engineering • Exposure to observability tooling such as Prometheus, Grafana, Datadog or ELK • Understanding of cloud and infrastructure automation, ideally AWS, GCP or Terraform • Any experience with Ceph, MinIO, JuiceFS, FUSE ...

SRE Technical Lead

Hiring Organisation
F5 consultants
Location
Berkshire, South East, United Kingdom
Employment Type
Permanent, Work From Home
engineering standards, operational maturity, and long-term platform stability. You'll work within a modern cloud-native environment leveraging Kubernetes, OpenShift, GitOps, service mesh, observability tooling, and automation-first engineering practices. This is a highly influential role where you'll lead and mentor high-performing SRE teams while remaining technically … time genuinely matter. Skills Required Strong expertise in Kubernetes and OpenShift (non-negotiable) Experience with multi-cloud and hybrid architectures Hands-on experience with observability platforms Strong Infrastructure as Code and GitOps experience Proven experience with CI/CD automation and reliability-focused engineering Demonstrated ability to lead and mentor ...

Cloud Operations Engineer

Hiring Organisation
Anson Mccade
Location
Cheltenham, Gloucestershire, South West, United Kingdom
Employment Type
Permanent
strong hands-on experience required) Kubernetes (deployment, troubleshooting, and platform support) Infrastructure as Code (Terraform or similar tools) Cloud-native networking and system troubleshooting Observability and monitoring tools APIs and integration services Secure, restricted, air-gapped cloud environments Required Experience Strong experience working with Linux-based systems in production environments … operate within highly secure cloud architectures Desirable Experience Kubernetes administration or advanced troubleshooting experience Infrastructure as Code experience (Terraform or similar) Exposure to observability and monitoring platforms Experience working in 24/7 operational environments Prior experience coordinating shifts or leading small technical teams deep expertise in secure cloud operations ...

Backend Engineering Team Lead

Hiring Organisation
Jobleads-UK
Location
Birmingham, England, United Kingdom
software lifecycle from design ideation through to production and eventual decommissioning. Our engineering teams work under a true DevOps culture - with infrastructure as code, observability, automated testing, and continuous delivery treated as first-order concerns, not afterthoughts. You’ll set architectural direction, partner closely with your Product Manager counterpart … Technical Environment Languages & frameworks: C#/.NET Cloud: Azure Architecture: Event-driven systems and microservice development (Service Bus) Engineering culture: DevOps, infrastructure as code, observability and monitoring, automated testing across all environments including production, continuous delivery Our Engineering Approach Full ownership: Teams own their solutions end-to-end - from inception ...

Backend Engineering Team Lead

Hiring Organisation
Jobleads-UK
Location
Bristol, England, United Kingdom
software lifecycle from design ideation through to production and eventual decommissioning. Our engineering teams work under a true DevOps culture - with infrastructure as code, observability, automated testing, and continuous delivery treated as first-order concerns, not afterthoughts. You’ll set architectural direction, partner closely with your Product Manager counterpart … Technical Environment Languages & frameworks: C#/.NET Cloud: Azure Architecture: Event-driven systems and microservice development (Service Bus) Engineering culture: DevOps, infrastructure as code, observability and monitoring, automated testing across all environments including production, continuous delivery Our Engineering Approach Full ownership: Teams own their solutions end-to-end - from inception ...

Backend Engineering Team Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
software lifecycle from design ideation through to production and eventual decommissioning. Our engineering teams work under a true DevOps culture - with infrastructure as code, observability, automated testing, and continuous delivery treated as first-order concerns, not afterthoughts. You’ll set architectural direction, partner closely with your Product Manager counterpart … Technical Environment Languages & frameworks: C#/.NET Cloud: Azure Architecture: Event-driven systems and microservice development (Service Bus) Engineering culture: DevOps, infrastructure as code, observability and monitoring, automated testing across all environments including production, continuous delivery Our Engineering Approach Full ownership: Teams own their solutions end-to-end - from inception ...