1 to 25 of 41 Observability Jobs in the East of England

Senior AI Engineer

Hiring Organisation
Resourcing Group
Location
Milton, Cambridgeshire, UK
foundation models via Azure OpenAI, AWS Bedrock and open-weight model endpoints (Llama, Mistral, Claude, GPT family). Applying prompt engineering, guardrails, evaluation and observability using tooling such as LangSmith, Langfuse, Ragas, promptfoo and Azure AI Content Safety. Productionising solutions with FastAPI, async Python, Pydantic, Docker and Kubernetes, deployed into ...

Site Reliability Engineer

Hiring Organisation
Anglian Water
Location
Huntingdon, Cambridgeshire, East Anglia, United Kingdom
Employment Type
Permanent
Salary
£40,000
looking for * Experience in site reliability engineering or DevOps roles * Strong scripting and automation skills (e.g., Python, Bash) * Knowledge of monitoring tools and observability practices * Understanding of cloud infrastructure and containerisation * Excellent problem-solving and analytical abilities * Commitment to continuous improvement and operational excellence Benefits As a valued employee ...

Software Engineer

Hiring Organisation
Accountancy Action
Location
Hertfordshire, England, United Kingdom
Employment Type
Full-Time
Salary
£70,000 - £90,000 per annum
Exposure to AI systems, LLMs, or conversational workflows Experience in healthcare or regulated environments Knowledge of infrastructure-as-code (e.g. Terraform, CDK) Experience with observability, monitoring, and scaling production systems ...

Senior Machine Learning Engineer - Agentic AI Platform

Hiring Organisation
Robert Half Limited
Location
Cambridge, Cambridgeshire, East Anglia, United Kingdom
Employment Type
Permanent, Work From Home
within the agent framework. Inference & Performance: Optimize LLM integration, latency, and cost efficiency. State & Reliability: Strengthen Redis-backed persistence and ensure system consistency. Evaluation & Observability: Build regression frameworks and implement monitoring and tracing. What We're Looking For Strong Python engineering experience with production-grade systems Hands-on with ...

Founding Engineer

Hiring Organisation
RedTech Recruitment Ltd
Location
Cambridge, Cambridgeshire, East Anglia, United Kingdom
Employment Type
Permanent
Salary
£95,000
develop high-quality frontend interfaces that make complex AI outputs intuitive and actionable for users Build and maintain deployment pipelines, testing frameworks, monitoring, and observability systems Design and implement secure data pipelines with appropriate access controls and auditability Ensure the platform meets enterprise-grade security and compliance requirements ...

Domain Architect

Hiring Organisation
TALENT INTERNATIONAL UK LTD
Location
Colchester, Essex, UK
ADRs). Key Requirements: Technical Foundation: Solid experience with Microservices, Distributed Systems, and Cloud-native design. Architecture Principles: Understanding of DDD, resilience patterns, and observability (logging/tracing). Reliability: Familiarity with SLO/SLI concepts and availability modeling. Bonus: Knowledge of the energy sector or trading platforms. Essential Skills ...

Domain Architect

Hiring Organisation
TALENT INTERNATIONAL UK LTD
Location
Peterborough, Cambridgeshire, UK
ADRs). Key Requirements: Technical Foundation: Solid experience with Microservices, Distributed Systems, and Cloud-native design. Architecture Principles: Understanding of DDD, resilience patterns, and observability (logging/tracing). Reliability: Familiarity with SLO/SLI concepts and availability modeling. Bonus: Knowledge of the energy sector or trading platforms. Essential Skills ...

Domain Architect

Hiring Organisation
TALENT INTERNATIONAL UK LTD
Location
Luton, Bedfordshire, UK
ADRs). Key Requirements: Technical Foundation: Solid experience with Microservices, Distributed Systems, and Cloud-native design. Architecture Principles: Understanding of DDD, resilience patterns, and observability (logging/tracing). Reliability: Familiarity with SLO/SLI concepts and availability modeling. Bonus: Knowledge of the energy sector or trading platforms. Essential Skills ...

Domain Architect

Hiring Organisation
TALENT INTERNATIONAL UK LTD
Location
Ipswich, Suffolk, UK
ADRs). Key Requirements: Technical Foundation: Solid experience with Microservices, Distributed Systems, and Cloud-native design. Architecture Principles: Understanding of DDD, resilience patterns, and observability (logging/tracing). Reliability: Familiarity with SLO/SLI concepts and availability modeling. Bonus: Knowledge of the energy sector or trading platforms. Essential Skills ...

Domain Architect

Hiring Organisation
TALENT INTERNATIONAL UK LTD
Location
Norwich, Norfolk, UK
ADRs). Key Requirements: Technical Foundation: Solid experience with Microservices, Distributed Systems, and Cloud-native design. Architecture Principles: Understanding of DDD, resilience patterns, and observability (logging/tracing). Reliability: Familiarity with SLO/SLI concepts and availability modeling. Bonus: Knowledge of the energy sector or trading platforms. Essential Skills ...

Domain Architect

Hiring Organisation
TALENT INTERNATIONAL UK LTD
Location
Hemel Hempstead, Hertfordshire, UK
ADRs). Key Requirements: Technical Foundation: Solid experience with Microservices, Distributed Systems, and Cloud-native design. Architecture Principles: Understanding of DDD, resilience patterns, and observability (logging/tracing). Reliability: Familiarity with SLO/SLI concepts and availability modeling. Bonus: Knowledge of the energy sector or trading platforms. Essential Skills ...

Lead Site Reliability Engineer SRE Azure SaaS

Hiring Organisation
Client Server
Location
Cambridge, Cambridgeshire, United Kingdom
Employment Type
Permanent
Salary
GBP 100,000 Annual
Lead Site Reliability Engineer (SRE Azure SaaS) Cambridge/WFH to £100k Do you have expertise with observability and monitoring within a SaaS environment? You could be progressing your career in a hands-on, influential role at a global InsurTech business, working on a flagship product that has recently been ...

Principal Engineer

Hiring Organisation
Synergetic Recruitment Group Limited
Location
Chelmsford, Essex, United Kingdom
Employment Type
Permanent
Salary
GBP 100,000 Annual
client is scaling a large, distributed cloud platform and is looking for a Principal Engineer to act as the Subject Matter Expert (SME) across observability and cloud infrastructure. You'll be working at serious scale managing thousands of Kubernetes nodes, handling tens of terabytes of logs daily, and supporting millions ...

Principal Engineer

Hiring Organisation
Synergetic Recruitment Group Limited
Location
Chelmsford, Essex, South East, United Kingdom
Employment Type
Permanent
client is scaling a large, distributed cloud platform and is looking for a Principal Engineer to act as the Subject Matter Expert (SME) across observability and cloud infrastructure. You'll be working at serious scale managing thousands of Kubernetes nodes, handling tens of terabytes of logs daily, and supporting millions … highly distributed environment. The Role This is a senior, hands-on role where you will own the technical direction and standards of the observability ecosystem. As the SME, you'll define best practice, guide architectural decisions, and act as the go-to expert across engineering teams, ensuring scalable, cost-efficient ...

Principal Engineer

Hiring Organisation
Synergetic
Location
Cambridgeshire, England, United Kingdom
client is scaling a large, distributed cloud platform and is looking for a Principal Engineer to act as the Subject Matter Expert (SME) across observability and cloud infrastructure. You’ll be working at serious scale managing thousands of Kubernetes nodes, handling tens of terabytes of logs daily, and supporting millions … highly distributed environment. The Role This is a senior, hands-on role where you will own the technical direction and standards of the observability ecosystem. As the SME, you’ll define best practice, guide architectural decisions, and act as the go-to expert across engineering teams, ensuring scalable, cost-efficient ...

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
Stevenage, Hertfordshire, UK
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
Ipswich, Suffolk, UK
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
Basildon, Essex, UK
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
Cambridge, Cambridgeshire, UK
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
Luton, Bedfordshire, UK
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

Site Reliability Engineer (Security Cleared)

Hiring Organisation
Profile 29
Location
Norwich, Norfolk, UK
performant infrastructure that underpins critical public-sector services. You'll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You'll also work with event-driven technologies, identity and access management, and data platforms, ensuring our orchestration solutions are resilient, secure, and future-ready. … using Terraform Build and operate scalable infrastructure in Amazon Web Services (AWS) Design, implement, and maintain robust CI/CD pipelines Improve system reliability, observability, performance, and security Implement monitoring, logging, and alerting solutions Troubleshoot production incidents and perform root cause analysis Collaborate with development teams to improve application resilience ...

DevSecOps Security Engineer - AWS, Security

Hiring Organisation
Adecco
Location
Cambridge, Cambridgeshire, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 - £100,000 per annum
prioritisation. * Partner with engineering teams to resolve issues efficiently and pragmatically. * Refine detection tooling by tuning logic and reducing unnecessary or inaccurate alerts. Operational Readiness & Observability * Strengthen visibility across systems through improved log pipelines, alerting pathways, and monitoring strategies. * Contribute to updating response guidelines, runbooks, and incident-handling materials. * Support initiatives … Kubernetes Security, Infrastructure as Code, Terraform, CloudFormation, Pipeline Security, Cloud Governance, Policy as Code, Secrets Management, Identity and Access Management, Vulnerability Remediation, Threat Detection, Observability, Logging, Automation Engineering, Python, Bash, Zero Trust, Security Hardening, Cloud Monitoring, Least Privilege, Compliance Automation, Security Orchestration About Adecco Adecco is acting as an Employment Agency. ...

Lead Site Reliability Engineer SRE Azure SaaS

Hiring Organisation
Client Server
Location
Cambridge, Cambridgeshire, East Anglia, United Kingdom
Employment Type
Permanent, Work From Home
Lead Site Reliability Engineer (SRE Azure SaaS) Cambridge/WFH to £100k Do you have expertise with observability and monitoring within a SaaS environment? You could be progressing your career in a hands-on, influential role at a global InsurTech business, working on a flagship product that has recently been … happy to mentor and coach others, sharing your SRE expertise with software engineers and DevOps You have a strong knowledge of Azure including observability, monitoring, scaling, security and Azure DevOps pipelines You have experience with observability tools, Datadog preferred You have a good knowledge of automation, scripting (Python or PowerShell ...

Performance Engineer

Hiring Organisation
Morson Edge
Location
Norwich, Norfolk, UK
combines deep technical capability with the ability to enable and coach multiple engineering teams. You'll own performance strategy, enhance internal performance tooling, improve observability, and help shape the next generation of AI-assisted performance analysis. The Opportunity You'll act as the central performance engineering specialist across multiple product … internal performance platform. The environment is mature in performance testing, so the focus is on taking things to the next level through automation, innovation, observability improvements, and AI-driven tooling. You'll work closely with engineering teams, tech leads, and platform teams to improve system performance, reduce manual effort ...

Performance Engineer

Hiring Organisation
Morson Edge
Location
Cambridge, Cambridgeshire, UK
combines deep technical capability with the ability to enable and coach multiple engineering teams. You'll own performance strategy, enhance internal performance tooling, improve observability, and help shape the next generation of AI-assisted performance analysis. The Opportunity You'll act as the central performance engineering specialist across multiple product … internal performance platform. The environment is mature in performance testing, so the focus is on taking things to the next level through automation, innovation, observability improvements, and AI-driven tooling. You'll work closely with engineering teams, tech leads, and platform teams to improve system performance, reduce manual effort ...