326 to 350 of 493 Observability Jobs in England

Senior Software Engineer II - Data Engineering

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ensure technical consistency. Design, develop, and maintain generative AI services and reusable components using Python. Define and promote best practices in engineering, including scalability, observability, testing, and CI/CD. Contribute to system designs spanning multiple services and modules, aligning with architectural best practices. Collaborate with product, platform, and research … work collaboratively across functions in an Agile or Kanban environment. Nice to have: Experience operationalizing LLMs or building an internal AI platform. Familiarity with observability practices (metrics, logging, alerts). Exposure to knowledge graphs or semantic search systems. Join our team and contribute to a culture of innovation, collaboration, and excellence. ...

IT Service Performance & Reliability Manager

Hiring Organisation
Spectrum IT Recruitment Limited
Location
New Milton, Hampshire, UK
Employment Type
Full-time
across critical IT services. This role focuses on keeping customer-facing services fast, reliable, and fully observable, while driving continuous improvement. You will lead observability across services, ensuring effective monitoring and actionable insights. You'll manage capacit...

Database Reliability Engineer | Postgres, Kubernetes & Cloud

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
leading UK fintech company seeks data professionals to modernize its database systems. Roles involve enhancing PostgreSQL and Kubernetes setups, establishing observability through monitoring tools, and ensuring data integrity across multi-cloud frameworks. Working hybrid, the ideal candidates are innovative and collaborative, with strong backgrounds in backend development and infrastructure provisioning. ...

SRE Engineer: High-Availability, Hybrid – London

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
approach. The candidate will work with Product Engineering teams to manage high availability and uptime, utilizing a code-first approach. Responsibilities include implementing observability standards, managing incidents, and optimizing platform costs. Offering a competitive salary of £85,000 - £90,000, the role is ideal for those passionate about reliability ...

Data Platform Solution Architect

Hiring Organisation
Jobleads-UK
Location
Basildon, England, United Kingdom
Design Documents (ADDs). Deep understanding of cloud-native design patterns. Experience in performance tuning across Snowflake, Airflow, and Iceberg. Focus on platform reliability, scalability, and observability. Experience designing and operating data platforms in production environments ...

Site Reliability Engineer - BACLJP00013172

Hiring Organisation
Jobleads-UK
Location
Bromley, England, United Kingdom
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engineering background ...

Site Reliability Engineer - BACLJP

Hiring Organisation
Huxley Associates
Location
Yorkshire, United Kingdom
Employment Type
Contract
Contract Rate
GBP 600 Daily
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engineering background ...

Site Reliability Engineer - BACLJP00013172

Hiring Organisation
Huxley Associates
Location
Bromley, London, South Yorkshire, United Kingdom
Employment Type
Contract
Contract Rate
£600/day
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engineering background ...

Head of Network Services

Hiring Organisation
G.R.E. Recruitment Limited
Location
Cirencester, Gloucestershire, South West, United Kingdom
Employment Type
Permanent
Salary
£75,000
WLAN, and SD-WAN; Network security and segmentation; OT/BMS/industrial networking environments; Routing, switching, and IP design; Monitoring, logging, and observability; High availability and resilience design. The company has been established for 17 years and has over 100 employees; they have added 25 new employees ...

Senior Network Architect, GPU Fabric and AI Infrastructure

Hiring Organisation
We Love Alfa
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 180,000 - 240,000 Annual
directly impact customer training workloads. This person will own network architecture across GPU fabric, InfiniBand, RoCE v2, Ethernet leaf spine, edge connectivity, peering, observability, deployment standards and operational handover. We are looking for someone who has: Deep GPU cluster or HPC deployment experience Strong InfiniBand production experience RoCE v2 experience ...

Senior Infrastructure Architect

Hiring Organisation
ALFA TECHNOLOGY RECRUITMENT LTD
Location
City of London, London, United Kingdom
Employment Type
Temporary
directly impact customer training workloads. This person will own network architecture across GPU fabric, InfiniBand, RoCE v2, Ethernet leaf spine, edge connectivity, peering, observability, deployment standards and operational handover. We are looking for someone who has: Deep GPU cluster or HPC deployment experience Strong InfiniBand production experience RoCE v2 experience ...

Head of Network Services

Hiring Organisation
G.R.E. Recruitment Limited
Location
Cirencester, Gloucestershire, UK
Employment Type
Full-time
WLAN, and SD-WAN; Network security and segmentation; OT/BMS/industrial networking environments; Routing, switching, and IP design; Monitoring, logging, and observability; High availability and resilience design. The company has been established for 17 years and has over 100 employees; they have added 25 new employees ...

BDR Language Speaker

Hiring Organisation
Pareto
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£30,000 - £35,000 per annum
must speak Filipino fluently to qualify for this role* Our client is a global data platform that helps turn data into action for Observability, IT, Security and more. Leaders in their field, our client is growing at an exciting rate and as such are now looking for new bi-lingual ...

Principal Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
cross‐team delivery of strategic outcomes; fewer blockers; consistent adherence to standards Developer experience and reliability Partner with Platform to uplift CI/CD, observability, SLOs and incident learning; advocate for fitness functions and paved roads Improved flow and stability metrics; faster, safer releases; measurable DX improvements People leadership Mentor ...

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
South East London, London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£65,000
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
London, UK
Employment Type
Full-time
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

Engineer - Site Reliability Engineering

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Write automation to scale systems sustainably, prevent service issues, or when they occur, quickly recover service. Partner with development teams to improve system reliability, observability, and release velocity. Participate in on-call rotations, incident response, postmortems, and root cause analysis and resolution. Be a vocal advocate of strong/sound … qualifications: Minimum 8-10 years in the industry. Experience on DevOps concepts and way of working. Experience with algorithms and data structures. Experience in Observability practices with logging, metrics, tracing, and alerting. Experience with Infrastructure as Code. Understanding of identity and access management, and application security. We use Datadog and BigPanda ...

Principal Site Reliability Engineering Expert Director

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
shaping how reliability, automation, and operational excellence are engineered across the organisation. Operating across domains including traditional infrastructure, cloud engineering, network operations, identity, observability, security, AI-driven operations, and automated data workflows, the role focuses on designing scalable systems, reusable engineering patterns, and standardised controls that reduce operational toil, improve … first, measurable, and repeatable practices. A key part of the role is building and evolving reusable CI/CD and Terraform modules, engineering guardrails, observability patterns, and automation frameworks that can be adopted across multiple teams and domains without requiring each team to solve the same problems independently. The Principal ...

Site Reliability Engineer (SRE)

Hiring Organisation
Pertemps Reading
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£45,000
platform automation, CI/CD, and developer tooling. This is a hands-on role split between supporting engineers and building scalable infrastructure, automation, and observability solutions. You'll work closely with the Head of Technology and engineering teams to improve reliability, developer experience, and platform performance. What You'll Be Doing Developer … Build reusable Terraform modules and manage infrastructure-as-code standards Develop internal tooling, automation scripts, self-service tooling, and platform improvements Own and improve observability across monitoring, dashboards, alerting, and runbooks Identify opportunities to automate manual processes and improve platform reliability Contribute to scalable, maintainable, and secure infrastructure practices What ...

AWS Platform Architect

Hiring Organisation
Oscar Associates (UK) Limited
Location
Birmingham, West Midlands, United Kingdom
Employment Type
Permanent
platform architecture and modernisation roadmap, including migration from a Java monolith to microservices on EKS. Define standards for containers, runtime environments, observability, tenancy, security, and infrastructure automation. Lead SRE practices including SLI/SLOs, incident management, DR/BCP planning, post-mortems, and operational resilience. Own platform security, secure SDLC … networking, KMS, RDS, and multi-account architecture. Hands-on Kubernetes, CI/CD, Terraform, and cloud security experience. Strong understanding of SRE, observability, incident response, and disaster recovery. Experience operating within regulated environments such as ISO 27001, SOC 2, or GxP. Comfortable balancing strategic leadership with hands-on operational delivery. ...

AI Native Software Engineer

Hiring Organisation
Skilliantech Ltd
Location
London, United Kingdom
Employment Type
Contract
implement AI agents, including: Retrieval (RAG) Orchestration workflows Tool/function invocation Policy-based routing Build evaluation frameworks for accuracy, latency, and reliability Implement observability and monitoring for agent lifecycle AI Platform Integration Integrate with AI providers (e.g., OpenAI, Anthropic, Google Vertex, open-source models) Build abstraction layers to support … production (agents, RAG, orchestration) Proficiency in Python, Java, or similar backend languages Experience with: CI/CD pipelines Infrastructure as code Monitoring and observability tools Hands-on experience with AI platforms (OpenAI, Claude, Vertex AI, or similar) Preferred Experience Experience with agent frameworks (e.g., LangGraph, AutoGen, CrewAI) Experience designing multi ...

Forward Deployed Engineers

Hiring Organisation
Randstad Digital
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
£450 - £500 per day + Inside IR35
implement AI agents, including: Retrieval (RAG) Orchestration workflows Tool/function invocation Policy-based routing Build evaluation frameworks for accuracy, latency, and reliability Implement observability and monitoring for agent lifecycle AI Platform Integration Integrate with AI providers (e.g., OpenAI, Anthropic, Google Vertex, open-source models) Build abstraction layers to support … production (agents, RAG, orchestration) Proficiency in Python, Java, or similar backend languages Experience with: CI/CD pipelines Infrastructure as code Monitoring and observability tools Hands-on experience with AI platforms (OpenAI, Claude, Vertex AI, or similar) Preferred Experience Experience with agent frameworks (e.g., LangGraph, AutoGen, CrewAI) Experience designing multi ...

Cloud Engineer

Hiring Organisation
Spectrum IT Recruitment
Location
Southampton, Hampshire, United Kingdom
Employment Type
Permanent
Salary
£55,000 - £65,000/annum, 15% Bonus
Terraform to automate and standardise infrastructure delivery. You'll support the migration and modernisation of traditional infrastructure into cloud services. You'll improve monitoring, observability, security and resilience across cloud platforms. You'll work with engineering, infrastructure and business teams to turn requirements into practical cloud solutions. You'll contribute … teams and wider stakeholders Useful: Cloud migration experience Azure DevOps and YAML pipelines PowerShell, Python or Bash scripting Docker or containerised environments Monitoring and observability tooling Experience in regulated or customer-critical environments Why apply? This is a good opportunity for a Cloud Engineer who wants to work on meaningful ...

Senior Platform Engineer

Hiring Organisation
AJ Bell
Location
Salford, Lancashire, England, United Kingdom
Employment Type
Full-Time
Salary
Competitive salary
evolving our core engineering platforms, including: Backstage and internal developer portal capabilities Engineering data platforms, including ELT workflows, DBT and SQL-based data pipelines Observability and monitoring Grafana platforms Internal automation and workflow platforms that support software delivery and engineering operations You’ll also contribute to broader platform engineering initiatives … Strong understanding of cloud platforms, containerisation and infrastructure as code Experience building self-service tooling, templates and developer enablement capabilities Experience with monitoring and observability Good understanding of security best practices in software delivery and platform design Strong problem-solving, communication and collaboration skills Ability to provide technical leadership, mentor ...

Senior Software Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Hasura and AWS Serverless Technologies such as Lambda, DynamoDB and EventBridge - all managed via AWS CDK & SST. We use Sentry, Lumigo and LogRocket for observability and GitHub Actions for automated testing and deployment. End-to-end Ownership. You will be entrusted with end-to-end ownership of your projects. From … having a high impact. You've spearheaded the engineering of critical systems before, working with best-in-class tooling in AWS, IaaC, observability & quality assessments. You want to discover the best ways to bring this to an early-stage startup. You know what good can look like. ...