Observability Jobs in the East of England

21 of 21 Observability Jobs in the East of England

Senior Infrastructure Automation Engineer

Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Arm Limited
automation scripts (Python, Bash, Shell) and tools (GitLab, Terraform, Vault, Ansible) to streamline deployment, monitoring, and management processes using Infrastructure as Code (IaC). Implement and integrate monitoring and observability solutions, like AIOps, for proactive system issue detection and response. Participate in on-call rotations to ensure 24/7 system availability. Maintain detailed documentation of infrastructure, processes, and procedures …
Employment Type: Permanent
Salary: GBP Annual

Remote Senior Site Reliability Engineer Manager (Remote)

Cambourne, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Remotestar
strong track record of building and maintaining highly reliable infrastructure and services. Expertise in incident management, including incident response, resolution, and post-mortem analysis. Proficiency in monitoring, alerting, and observability tools such as Prometheus, Grafana, ELK stack or Datadog. Experience with cloud platforms such as AWS, Azure, or GCP, including infrastructure as code tools like Terraform or CloudFormation. Strong scripting …
Employment Type: Permanent
Salary: GBP Annual

Senior Software and Platform Engineer

Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Arm Limited
and Maintain Backstage - Design, build, and maintain custom and community-backed Backstage plugins to support Arm's engineering teams, including CI/CD pipelines, service scaffolding, documentation, testing, and observability integrations. Collaborate Across Engineering & IT - Partner closely with platform, software and hardware teams to integrate services, tooling, and policies into the portal in a user-centric and automated manner. We …
Employment Type: Permanent
Salary: GBP Annual

Senior Azure DevOps Engineer

Suffolk, England, United Kingdom
Sanderson
preferred. Mastery of Git and version control workflows in a collaborative team. Comfortable in Agile/Scrum environments using tools like Jira and Confluence. Experience supporting production systems, including observability and monitoring. Desirable: Experience in regulated industries (e.g., financial services, healthcare). Working knowledge of MongoDB or similar document-oriented databases. Familiarity with Golang or Python to support infrastructure tooling. Microsoft …
Employment Type: Full-Time
Salary: £75,000 - £85,000 per annum

Principal Solutions Architect

Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Arm Limited
Java or Python. Deep understanding of AWS or other cloud providers (e.g. GCP, Azure). Strong understanding of key security technologies and protocols such as TLS, OAuth and SPIFFE. Observability, alerting, metrics collection and visualisation (e.g. Prometheus, Grafana, Elasticsearch, Dynatrace). "Nice To Have" Skills and Experience: We would be even more impressed if you are passionate about the following …
Employment Type: Permanent
Salary: GBP Annual

Staff Software and Platform Engineer

Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Arm Limited
and Maintain Backstage - Design, build, and maintain custom and community-backed Backstage plugins to support Arm's engineering teams, including CI/CD pipelines, service scaffolding, documentation, testing, and observability integrations. Collaborate Across Engineering & IT - Partner closely with platform, software and hardware teams to integrate services, tooling, and policies into the portal in a user-centric and automated manner. We …
Employment Type: Permanent
Salary: GBP Annual

Senior Full Stack Software Engineer (.Net / React) (Remote)

Cambourne, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Remotestar
end tests. Ability to write and understand design documentation using C4, sequence diagrams and workflows. Excellent problem-solving skills and attention to detail. Solid understanding of logging, monitoring and observability to understand if software is functioning as required. Strong communication and teamwork skills. Preferred Skills: Experience with cloud platforms e.g., AWS, Azure, Google Cloud. Knowledge of DevOps practices and CI …
Employment Type: Permanent
Salary: GBP Annual

Senior DevOps Engineer

Bar Hill, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Domino Group
work day to day. That could include CI/CD pipelines including builds and testing - particularly automated testing - as well as issue tracking, source code management, binary artifact management, observability, and business continuity measures. It's an agile environment here at Domino: we've adopted Kanban, and most other teams use Scrum. For context, some of the technology we're …
Employment Type: Permanent
Salary: GBP Annual

Technical Product Delivery Manager

Watford, Hertfordshire, United Kingdom
Hybrid / WFH Options
Wickes
continuous improvement and team velocity. You'll have a deep understanding of modern cloud ecosystems, with extensive hands-on experience in Amazon Web Services (AWS). Familiarity with modern observability concepts and tools, including Datadog, and proven experience with the "platform as a product" model and driving adoption of internal tools. Strong familiarity with CI/CD principles and pipelines …
Employment Type: Permanent
Salary: GBP Annual

Monitoring & Observability Engineer

Lakenheath, Suffolk, United Kingdom
Computacenter AG & Co. oHG
Life on the team: At Computacenter, you'll be joining a world-class team of over 1,000 skilled professionals within Group Professional Services (GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and … modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll … do: Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk. Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues. Integrate monitoring platforms with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows. Act as a Monitoring …
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer

Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Gearset Limited
changes quickly and safely. We live and breathe this approach ourselves: we release new versions of Gearset multiple times a day and we continually invest in improving our own observability and infrastructure tools. This means we can identify and react to issues quickly and delight our users by getting improvements to them as fast as possible. As a product-driven …
Employment Type: Permanent
Salary: GBP Annual

Licensing & Tools Applications Engineer

Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Arm Limited
and attitude on automating common repetitive tasks. A suitable sense of ownership and responsibility in driving tasks to timely full completion. "Nice To Have" Skills and Experience: AIOps and Observability. Meaningful experience in a distributed team. Working in a sophisticated, multi-geography, engineering services environment! Providing technical support and mentoring to others. Accommodations at Arm: At Arm, we want to …
Employment Type: Permanent
Salary: GBP Annual

Platform Engineer

Hemel Hempstead, Hertfordshire, United Kingdom
Hybrid / WFH Options
Eckoh
a secure, highly available, PCI-compliant AWS platform that underpins Eckoh's mission-critical services. As a senior member of the team, you will drive improvements in platform reliability, observability, and operational excellence. You will collaborate closely with development teams to enable secure, automated delivery of services while championing DevSecOps principles. This role offers the chance to shape the future … secure PCI-compliant cloud platform on AWS to support enterprise-grade applications and services. Architect and operate production workloads with a focus on high availability, scalability, and resilience. Drive observability and monitoring improvements across infrastructure and services to proactively identify issues. Promote and embed a security-first, DevSecOps culture, ensuring best practices are followed at every stage of the software … Strong knowledge of CI/CD pipelines and automation tooling (GitLab experience preferable). Experience with "infrastructure as code" (Terraform, CloudFormation), containerisation (Docker), and orchestration (Kubernetes). Proficiency with observability and monitoring solutions (e.g., CloudWatch, Prometheus, Grafana, Splunk). Strong understanding of cloud-native development practices and agile ways of working. Confident conducting peer code reviews and providing constructive technical …
Employment Type: Permanent
Salary: £80000/annum

Platform Engineer

Hemel Hempstead, Hertfordshire, South East, United Kingdom
Hybrid / WFH Options
Eckoh PLC
a secure, highly available, PCI-compliant AWS platform that underpins Eckoh's mission-critical services. As a senior member of the team, you will drive improvements in platform reliability, observability, and operational excellence. You will collaborate closely with development teams to enable secure, automated delivery of services while championing DevSecOps principles. This role offers the chance to shape the future … secure PCI-compliant cloud platform on AWS to support enterprise-grade applications and services. Architect and operate production workloads with a focus on high availability, scalability, and resilience. Drive observability and monitoring improvements across infrastructure and services to proactively identify issues. Promote and embed a security-first, DevSecOps culture, ensuring best practices are followed at every stage of the software … Strong knowledge of CI/CD pipelines and automation tooling (GitLab experience preferable). Experience with 'infrastructure as code' (Terraform, CloudFormation), containerisation (Docker), and orchestration (Kubernetes). Proficiency with observability and monitoring solutions (e.g., CloudWatch, Prometheus, Grafana, Splunk). Strong understanding of cloud-native development practices and agile ways of working. Confident conducting peer code reviews and providing constructive technical …
Employment Type: Permanent, Work From Home
Salary: £80,000

Senior DevOps and SRE Engineer

Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Arm Limited
collaborate across teams to: Modernise our infrastructure by leading the migration from Docker Swarm to Kubernetes; Design and operate CI/CD pipelines using CloudBees and GitLab; Build out observability with Prometheus, Grafana, OpenTelemetry, and Dynatrace; Automate cloud deployments (AWS-first) using Terraform and platform tooling; Improve security posture across IAM, secrets, and networking; Help the team ship faster and … TypeScript, Python). Validated experience operating distributed systems at scale in production. Cloud AWS (primary), Kubernetes (future), Docker (current), Terraform. Excellent debugging skills across network, systems, and data stack. Observability tooling, e.g. custom metrics pipelines, OpenTelemetry tracing, or integrations across telemetry stacks. Security engineering and practical understanding of IAM hardening, zero-trust network principles, and secrets management in data-heavy …
Employment Type: Permanent
Salary: GBP Annual

Technical Programme Manager

Cambridge, Cambridgeshire, East Anglia, United Kingdom
Hybrid / WFH Options
La Fosse
infrastructure platform with AI-operable capabilities. Oversee key infrastructure components such as data centre expansion, programmable compute, and software-defined network/storage. Enable automation-first delivery models with observability, self-healing, and policy-driven control. Implement and mature GitOps workflows, IaC pipelines, and CI/CD processes across engineering teams. Lead programme governance, risk management, and stakeholder engagement. Partner …
Employment Type: Contract
Rate: £750 - 950 per day

Head of Delivery Enablement

Watford, Hertfordshire, England, United Kingdom
Method Resourcing
function integrated throughout the software development lifecycle. Partnering closely with product and engineering teams, you will help scope and estimate strategic work, align on tooling, and drive improvements in observability, automation, and testing. Ideal Experience & Skills: Demonstrated technical leadership across diverse skillsets, including Site Reliability Engineering (SRE), DevOps, and Quality Assurance (QA). Proven track record of aligning and integrating cross …
Employment Type: Full-Time
Salary: £90,000 - £95,000 per annum

AI / Machine Learning

Cambridge, Cambridgeshire, United Kingdom
So Code Limited
new AI/ML methods; Deployment and serving of models at scale; Infrastructure automation and cloud-native design; Responsible AI, LLM safety, and interpretability tooling; Data pipelines, versioning, and observability in production. A glimpse of roles we recruit for: AI Research Scientist; Machine Learning Engineer; Data Engineer with ML experience; Applied Scientist/Research Engineer; DevOps for AI/AI …
Employment Type: Permanent
Salary: GBP Annual

Senior Cloud Engineer

Cambridge, Cambridgeshire, United Kingdom
So Code Limited
designing, implementing and maintaining robust AWS-based infrastructure environments, with a strong focus on automation, security and reliability. This includes supporting CI/CD processes, infrastructure as code, and observability tooling across multiple environments. Key Deliverables: Design and operate AWS infrastructure including EKS, networking and storage. Build infrastructure using Terraform with a focus on best practices and reusability. Manage cloud … storage design. Deep understanding of GitLab CI/CD workflows. Solid troubleshooting capabilities across distributed and containerised systems. Comfortable with infrastructure release management within a secure SDLC. Proficient with observability tooling and cloud-native monitoring stacks. Contract Details: Outside IR35. Initial 6-month engagement.
Employment Type: Permanent
Salary: GBP Annual

Senior Full-Stack Software Engineering Lead

Cambridge, Cambridgeshire, United Kingdom
Eclipse Automation Inc
portals, dashboards, internal tools, and web applications. Collaborate closely with DevOps on CI/CD pipelines, deployment workflows, infrastructure, and SecOps compliance. Uphold high standards for code quality, system observability, and technical documentation. Act as the technical lead, setting direction and best practices for the full-stack engineering team. Mentor engineers, providing guidance on architecture, design patterns, and career growth. … cross-functional teams. Deep experience with React, TypeScript, .NET Core, SOAP/REST APIs, and MySQL/PostgreSQL, Red Hat OpenShift, Kubernetes. Understanding of DevOps, cloud deployments, and service observability. Bonus: Interest/experience in AI, digital twins, Nvidia Omniverse SDK & APIs, Universal Scene Description. What We Offer: Reimbursement for tuition and professional dues. Three weeks of vacation and five …
Employment Type: Permanent
Salary: GBP Annual

Sr. Technical Account Manager, Payment Processing

Braintree, Essex, United Kingdom
Hybrid / WFH Options
Supercash
dispute representment solutions and fraud tooling. Provide ongoing support to the Risk Operations and Dispute Operations teams for risk rule tuning and monitoring. Enable Subscription Success: Design alerting/observability for internal retry logic and applicable vendors, focusing on key metrics like authorization and recovery rates. Act as First Responder (On-Call): Participate in the TAM on-call rotation, responding … while documenting post-mortems and recommendations. Build for Reliability & Continuity: Document fallback scenarios and potential vendor replacements. Track SLAs, vendor performance, and incidents to support business continuity planning. Champion Observability and AI Adoption: Deploy AI-based anomaly detection and observability tooling. Leverage AI tools like Cursor to interrogate code repositories and surface root causes faster. Collaborate Across Teams and Vendors … partners to escalate, triage, and resolve complex technical issues. Who we're looking for: Proficiency with SQL, Python, and/or other analytics tools to support data-driven troubleshooting, observability, and reporting. Hands-on experience working with or supporting payments orchestration across multiple processors (e.g., Braintree, Adyen). Familiarity with AI tooling for debugging or observability (e.g., Cursor) or experience …
Employment Type: Permanent
Salary: GBP Annual