76 to 100 of 116 Observability Jobs in the East of England

Remote Data Engineering Manager

Hiring Organisation
Airalo
Location
Luton, Bedfordshire, UK
workflow from reactive problem-solving to structured, agile delivery. Oversee the maintenance and optimization of high-performance data pipelines, implementing CI/CD automation, observability frameworks, and strict data quality gates. Roll up your sleeves when necessary to assist with complex code reviews, Python/Scala development, or unblocking … Python (and/or Scala) and advanced SQL across relational and non-relational databases. Experience implementing CI/CD, Infrastructure as Code, and observability/monitoring for data pipelines. Bachelor's degree in Computer Science, Engineering, Statistics, Information Systems, or a related quantitative field. Nice-to-have: Experience implementing data ...

Remote Data Engineering Manager

Hiring Organisation
Airalo
Location
Basildon, Essex, UK
workflow from reactive problem-solving to structured, agile delivery. Oversee the maintenance and optimization of high-performance data pipelines, implementing CI/CD automation, observability frameworks, and strict data quality gates. Roll up your sleeves when necessary to assist with complex code reviews, Python/Scala development, or unblocking … Python (and/or Scala) and advanced SQL across relational and non-relational databases. Experience implementing CI/CD, Infrastructure as Code, and observability/monitoring for data pipelines. Bachelor's degree in Computer Science, Engineering, Statistics, Information Systems, or a related quantitative field. Nice-to-have: Experience implementing data ...

Remote ML Infrastructure Lead

Hiring Organisation
Iproov
Location
Norwich, Norfolk, UK
versioning, reproducibility, experimentation, feature management and release management. Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience. Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency. Build or optimise … pipelines, infrastructure-as-code and workflow orchestration. Experience with tools such as Airflow or similar platform and orchestration technologies. Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility. Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving. Strong appreciation of reliability ...

Remote ML Infrastructure Lead

Hiring Organisation
Iproov
Location
Cambridge, Cambridgeshire, UK
versioning, reproducibility, experimentation, feature management and release management. Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience. Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency. Build or optimise … pipelines, infrastructure-as-code and workflow orchestration. Experience with tools such as Airflow or similar platform and orchestration technologies. Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility. Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving. Strong appreciation of reliability ...

Data Platform Solution Architect

Hiring Organisation
Jobleads-UK
Location
Basildon, England, United Kingdom
Design Documents (ADDs). Deep understanding of cloud-native design patterns. Experience in performance tuning across Snowflake, Airflow, and Iceberg. Focus on platform reliability, scalability, and observability. Experience designing and operating data platforms in production environments ...

Remote Field Events Marketing Manager

Hiring Organisation
Arize AI
Location
Norwich, Norfolk, UK
powerful ways to monitor, troubleshoot, and optimize their AI systems. That’s where we come in. Arize AI is the leading AI & Agent Engineering observability and evaluation platform, empowering AI engineers to ship high-performing, reliable agents and applications. From first prototype to production scale, Arize AX unifies build, test ...

Remote Field Events Marketing Manager

Hiring Organisation
Arize AI
Location
Watford, Hertfordshire, UK
powerful ways to monitor, troubleshoot, and optimize their AI systems. That’s where we come in. Arize AI is the leading AI & Agent Engineering observability and evaluation platform, empowering AI engineers to ship high-performing, reliable agents and applications. From first prototype to production scale, Arize AX unifies build, test ...

Remote Field Events Marketing Manager

Hiring Organisation
Arize AI
Location
Southend-on-Sea, Essex, UK
powerful ways to monitor, troubleshoot, and optimize their AI systems. That’s where we come in. Arize AI is the leading AI & Agent Engineering observability and evaluation platform, empowering AI engineers to ship high-performing, reliable agents and applications. From first prototype to production scale, Arize AX unifies build, test ...

Remote Cloud Engineer (AWS) Full time - Remote EU

Hiring Organisation
Retinai Medical
Location
Watford, Hertfordshire, UK
services. Manage and prioritize tasks in the cloud infrastructure backlog to address immediate needs and plan long-term improvements. Set up infrastructure monitoring and observability solutions, proactively addressing availability, performance or security issues. Learn the current infrastructure and take ownership of day to day cloud operational tasks and activities. Assess … with software version control and Git. Strong understanding of cloud networking concepts, including VPC, VPC Peering, Subnets, and Load Balancing. Familiarity with monitoring and observability tools for cloud environments, such as Grafana, Prometheus, OpenSearch, and the ELK stack. Strong analytical and problem-solving skills, with a proactive approach to challenges. ...

Remote Cloud Engineer (AWS) Full time - Remote EU

Hiring Organisation
Retinai Medical
Location
Ipswich, Suffolk, UK
services. Manage and prioritize tasks in the cloud infrastructure backlog to address immediate needs and plan long-term improvements. Set up infrastructure monitoring and observability solutions, proactively addressing availability, performance or security issues. Learn the current infrastructure and take ownership of day to day cloud operational tasks and activities. Assess … with software version control and Git. Strong understanding of cloud networking concepts, including VPC, VPC Peering, Subnets, and Load Balancing. Familiarity with monitoring and observability tools for cloud environments, such as Grafana, Prometheus, OpenSearch, and the ELK stack. Strong analytical and problem-solving skills, with a proactive approach to challenges. ...

Remote Cloud Engineer (AWS) Full time - Remote EU

Hiring Organisation
Retinai Medical
Location
Basildon, Essex, UK
services. Manage and prioritize tasks in the cloud infrastructure backlog to address immediate needs and plan long-term improvements. Set up infrastructure monitoring and observability solutions, proactively addressing availability, performance or security issues. Learn the current infrastructure and take ownership of day to day cloud operational tasks and activities. Assess … with software version control and Git. Strong understanding of cloud networking concepts, including VPC, VPC Peering, Subnets, and Load Balancing. Familiarity with monitoring and observability tools for cloud environments, such as Grafana, Prometheus, OpenSearch, and the ELK stack. Strong analytical and problem-solving skills, with a proactive approach to challenges. ...

Remote Senior Software Engineer II, Infra Engineering (Remote UK)

Hiring Organisation
Optro GmbH
Location
Luton, Bedfordshire, UK
SaaS applications globally in the cloud. Drive infrastructure features end-to-end, from design documentation through implementation, rollout, and operational ownership. Build and deliver observability tools and analyze data, working with the application development team to ensure a consistently superb customer experience. Continue to grow automation for infrastructure provisioning, developer … working with cloud services providers (AWS or Azure preferred). Experience with Infrastructure as Code and other cloud automation tools (HashiCorp suite). Experience building observability tooling, dashboards, and alerting for monitoring distributed systems. Experience designing globally distributed systems in a cloud-based, container-based world. Comfort writing clear design documentation ...

Remote Senior Software Engineer II, Infra Engineering (Remote UK)

Hiring Organisation
Optro GmbH
Location
Norwich, Norfolk, UK
SaaS applications globally in the cloud. Drive infrastructure features end-to-end, from design documentation through implementation, rollout, and operational ownership. Build and deliver observability tools and analyze data, working with the application development team to ensure a consistently superb customer experience. Continue to grow automation for infrastructure provisioning, developer … working with cloud services providers (AWS or Azure preferred). Experience with Infrastructure as Code and other cloud automation tools (HashiCorp suite). Experience building observability tooling, dashboards, and alerting for monitoring distributed systems. Experience designing globally distributed systems in a cloud-based, container-based world. Comfort writing clear design documentation ...

Remote Senior Software Engineer II, Infra Engineering (Remote UK)

Hiring Organisation
Optro GmbH
Location
Basildon, Essex, UK
SaaS applications globally in the cloud. Drive infrastructure features end-to-end, from design documentation through implementation, rollout, and operational ownership. Build and deliver observability tools and analyze data, working with the application development team to ensure a consistently superb customer experience. Continue to grow automation for infrastructure provisioning, developer … working with cloud services providers (AWS or Azure preferred). Experience with Infrastructure as Code and other cloud automation tools (HashiCorp suite). Experience building observability tooling, dashboards, and alerting for monitoring distributed systems. Experience designing globally distributed systems in a cloud-based, container-based world. Comfort writing clear design documentation ...

Remote Senior Director, Engineering- X-Ops Platform

Hiring Organisation
Sophos
Location
Southend-on-Sea, Essex, UK
vision, strategy, and operating model for the X-Ops Platform organisation (e.g., Delivery Roadmap, AI First Developer Experience, CI/CD, Observability, SRE, Cloud & Infrastructure Enablement). Lead, coach, and develop engineering leaders (directors, managers and senior ICs), building high-performing teams with clear ownership and strong engineering culture. Own platform … record of building reliable, secure platforms and improving developer productivity through pragmatic, outcome-driven investment. Experience establishing operational excellence practices (incident response, on-call, observability, SLOs, post-incident learning) and driving continuous improvement. Ability to set strategy and translate it into execution via roadmaps, prioritisation, and clear measures of success ...

Remote Staff Backend Software Engineer (Remote)

Hiring Organisation
Spoke
Location
Peterborough, Cambridgeshire, UK
things running smoothly in production. In practice, that means building and improving backend services, supporting distributed systems, and making sure we have the right observability approach in place, from logging through to alerting, so issues are easier to spot and quicker to resolve. You’ll take the lead across multiple … decisions clearly and thoughtfully. The technology and tools we build with: Programming Language: Node/TypeScript; Databases: PostgreSQL, Firestore; Cloud Provider: Google Cloud; Monitoring, Observability & Logging: Prometheus, Grafana, Honeycomb, Google Cloud. A bit about us: Back in 2017, we saw an issue with last-mile delivery. Delivery drivers have ...

Remote Staff Backend Software Engineer (Remote)

Hiring Organisation
Spoke
Location
Stevenage, Hertfordshire, UK
things running smoothly in production. In practice, that means building and improving backend services, supporting distributed systems, and making sure we have the right observability approach in place, from logging through to alerting, so issues are easier to spot and quicker to resolve. You’ll take the lead across multiple … decisions clearly and thoughtfully. The technology and tools we build with: Programming Language: Node/TypeScript; Databases: PostgreSQL, Firestore; Cloud Provider: Google Cloud; Monitoring, Observability & Logging: Prometheus, Grafana, Honeycomb, Google Cloud. A bit about us: Back in 2017, we saw an issue with last-mile delivery. Delivery drivers have ...

Remote Staff Backend Software Engineer (Remote)

Hiring Organisation
Spoke
Location
Luton, Bedfordshire, UK
things running smoothly in production. In practice, that means building and improving backend services, supporting distributed systems, and making sure we have the right observability approach in place, from logging through to alerting, so issues are easier to spot and quicker to resolve. You’ll take the lead across multiple … decisions clearly and thoughtfully. The technology and tools we build with: Programming Language: Node/TypeScript; Databases: PostgreSQL, Firestore; Cloud Provider: Google Cloud; Monitoring, Observability & Logging: Prometheus, Grafana, Honeycomb, Google Cloud. A bit about us: Back in 2017, we saw an issue with last-mile delivery. Delivery drivers have ...

Remote Staff Backend Software Engineer (Remote)

Hiring Organisation
Spoke
Location
Chelmsford, Essex, UK
things running smoothly in production. In practice, that means building and improving backend services, supporting distributed systems, and making sure we have the right observability approach in place, from logging through to alerting, so issues are easier to spot and quicker to resolve. You’ll take the lead across multiple … decisions clearly and thoughtfully. The technology and tools we build with: Programming Language: Node/TypeScript; Databases: PostgreSQL, Firestore; Cloud Provider: Google Cloud; Monitoring, Observability & Logging: Prometheus, Grafana, Honeycomb, Google Cloud. A bit about us: Back in 2017, we saw an issue with last-mile delivery. Delivery drivers have ...

Remote Sr. Software Engineer, Fullstack (UK)

Hiring Organisation
First Up
Location
Luton, Bedfordshire, UK
post-incident reviews in a "you build it, you run it" environment. Identify, analyse, and resolve system availability, reliability, and performance issues, contributing to observability and resiliency improvements. Partner with Product Management and Design to translate business requirements into scalable technical solutions. Minimum Qualifications Bachelor's degree in Computer Science … HRIS platforms such as Workday, SAP SuccessFactors, Dayforce, or similar enterprise HR systems. Experience with Kubernetes, Docker, and Helm. Experience with Datadog or similar observability and monitoring platforms. Demonstrated use of Generative AI tools or coding agents in development workflows. Experience in enterprise SaaS organisations, particularly HR Tech or regulated ...

Remote Sr. Software Engineer, Fullstack (UK)

Hiring Organisation
First Up
Location
Ipswich, Suffolk, UK
post-incident reviews in a "you build it, you run it" environment. Identify, analyse, and resolve system availability, reliability, and performance issues, contributing to observability and resiliency improvements. Partner with Product Management and Design to translate business requirements into scalable technical solutions. Minimum Qualifications Bachelor's degree in Computer Science … HRIS platforms such as Workday, SAP SuccessFactors, Dayforce, or similar enterprise HR systems. Experience with Kubernetes, Docker, and Helm. Experience with Datadog or similar observability and monitoring platforms. Demonstrated use of Generative AI tools or coding agents in development workflows. Experience in enterprise SaaS organisations, particularly HR Tech or regulated ...

Principal Private Cloud Engineer

Hiring Organisation
Jobleads-UK
Location
Cambridge, England, United Kingdom
Job Description You will define and lead the technical strategy for a private cloud platform based on OpenStack, delivering scalable and reliable infrastructure services for engineering teams. The platform underpins large-scale engineering workloads and ...

Dynatrace/Observability Engineer - 6 months - Basildon

Hiring Organisation
Kirtana Consulting
Location
Basildon, Essex, United Kingdom
Employment Type
Contract
Contract Rate
GBP Annual
Kirtana Consulting is looking for a Dynatrace/Observability Engineer for a 6-month rolling contract in Basildon. Job description: Role Title: Dynatrace Expert. This role is a combination of Dynatrace/Observability Engineer and Lead COE member. The candidate should be able to carry out hands-on Dynatrace activities if application teams … Dynatrace. -Dynatrace -AppDynamics -Dynatrace -preferably Dynatrace-certified. Talent should have 6+ years' work experience. Design and implement E2E observability for business-critical payment and banking workflows. Configure request attributes to enrich traces with business context (transaction IDs, payment types). Trace distributed transactions across microservices, MQ, mainframes, and APIs using PurePath ...

Remote Senior Software Engineer - Python and Data Ecosystem

Hiring Organisation
ClickHouse
Location
Watford, Hertfordshire, UK
customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads. The company’s sustained, accelerating momentum was recently validated by a $400M Series D financing round. Over the past three months, customers including Capital … orchestration platforms, and AI tooling. Our work directly shapes how companies process massive datasets: real-time analytics platforms ingesting millions of events per second, observability systems monitoring global infrastructure, and increasingly, the AI-powered data applications redefining how teams work with data. We collaborate closely with the open-source community ...

Remote Senior Software Engineer - Python and Data Ecosystem

Hiring Organisation
ClickHouse
Location
Chelmsford, Essex, UK
customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads. The company’s sustained, accelerating momentum was recently validated by a $400M Series D financing round. Over the past three months, customers including Capital … orchestration platforms, and AI tooling. Our work directly shapes how companies process massive datasets: real-time analytics platforms ingesting millions of events per second, observability systems monitoring global infrastructure, and increasingly, the AI-powered data applications redefining how teams work with data. We collaborate closely with the open-source community ...