Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Signify Technology
and performance management Solid commercial C++ experience on complex systems Proven experience with large, multi-component systems and distributed team practices Strong background in observability and logging Familiarity with infrastructure-as-code and automated deployments (Terraform, Helm or Flux) We make an active choice to be inclusive towards everyone every …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
TalentCo
to build scalable, low-latency systems. Collaborate on every part of the stack, from data ingestion and indexing to query planning and optimisation. Build observability, reliability, and scalability into a product being designed for some of the most demanding data applications in the world. Operate in a high-agency environment …
administration and troubleshooting. Proven experience in implementing and managing CI/CD pipelines and Infrastructure as Code (IaC) solutions. Proven experience in monitoring and observability tools to proactively manage system health. Skills and Strengths: AWS (Amazon Web Services) Auto Scaling Fargate Route53 Observability tools (New Relic, Datadog, Splunk) Containerization (Docker …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Wave Talent
and AI goals. 🛠️ Tech Stack: React, .NET 🧠 You’ll bring: Proven expertise in React and .NET in production environments Strong experience with system design, observability, and performance optimisation Great communication and stakeholder engagement skills Background in financial, payroll, or reporting systems is a bonus 💼 Contract Details Day rate: TBC Length …
applications in life sciences/health tech. Develop automated techniques for agentic system design and evaluation. Design tools and architectures for AI agents, including observability pipelines. Develop and deploy deep learning models, including fine-tuning LLMs. Mentor and grow the AI team, fostering innovation and best practices. Manage a portfolio …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Tembo
and spike solutions to evaluate technologies, libraries, and approaches for improving system reliability, auditing, and financial reconciliation accuracy. Open Standards: Support our commitment to observability and open standards. Contribute to initiatives around OpenTelemetry, OpenAPI, and other tools that improve transparency and traceability across services. About you At least 5 years …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Promote Project
Secondary Responsibilities Assist with support and bug triage. Assist with CI/CD pipeline as necessary. Assist with E2E tests as necessary. Improve application observability with logging and automated alerting. Explain technical concepts to non-technical stakeholders. Guide and mentor other engineers of all levels with their professional growth and …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Durlston Partners
deployment, monitoring, and incident response Tune performance across OS, network, and cloud layers; this role is hands-on and detail-oriented Improve system resilience, observability, and security in a high-stakes production environment Requirements: Fluent in Linux, not just using it, but understanding how it works under the hood Advanced … Docker (Kubernetes is a plus), infrastructure-as-code, and CI/CD tooling Strong scripting and automation experience in Python and Bash Familiarity with observability stacks (Prometheus, OpenTelemetry, eBPF) Cloud infrastructure experience (AWS/GCP/Azure), with attention to IAM and software supply chain security Curious, persistent, and comfortable …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
CATCHES
Platform for high-throughput, low-latency workloads. Implement infrastructure-as-code (Terraform/Bicep) and automated release workflows that enable true continuous delivery. Drive observability: log aggregation, metrics, distributed tracing and on-call runbooks. Champion security, cost-efficiency and performance tuning across our services. Collaborate with product and platform teams … Excellent communication skills and a track record of cross-team collaboration. Nice to have: Kubernetes expertise (GKE/AKS/EKS) and container-native observability stacks (Prometheus/Grafana). NoSQL experience (Firestore, Cosmos DB, DynamoDB, MongoDB). Experience with game-backend scales, real-time services or hybrid cloud/… PostgreSQL, MS SQL Server, Redis. Messaging: Pub/Sub, RabbitMQ, Azure Service Bus. Infra & Ops: Docker, Kubernetes, Terraform/Bicep, GitHub Actions, Cloud Build. Observability: OpenTelemetry, Grafana, Elastic.
At Ansys, we are reimagining the way complex simulation software is deployed across cloud, on-prem, and hybrid environments. We’re looking for a Cloud Platform Engineer who specializes in Infrastructure as Code (IaC) to lead the design, implementation, and …
performing hands-on engineering tasks. This is a day-rate contractor role, with an immediate interview and a start date shortly after. Your Responsibilities: Design, implement, and maintain observability solutions using Elastic in a Kubernetes environment Provide support during the pre-sales phase Onboard complex data sources to an observability tool Manage client demands and … consultative approach Excellent analytical and problem-solving skills Ability to build strong relationships and explain technical details to a diverse audience Up-to-date observability best-practice guidelines …
Expertise in Kubernetes, including AKS, EKS, containerization, and Helm Proven ability to meet and maintain SOC 2 or equivalent compliance Strong background in automation, observability, and GitOps workflows Comfortable using AI coding tools like GitHub Copilot, Cursor, or Claude to enhance delivery Bonus if you have experience supporting hybrid or … AKS, API Management, and DevOps Pipelines, and AWS, including EKS, Lambda, and CloudFormation Infrastructure as Code and GitOps: Terraform, Bicep, Pulumi, ArgoCD, and FluxCD Observability: Prometheus, Grafana, OpenTelemetry, and Datadog Security and Compliance: HashiCorp Vault, Azure Key Vault, AWS KMS, OPA Gatekeeper, and Drata or similar AI Coding Tools: GitHub …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Harrington Starr
scale - all through smart automation and modern cloud-native infrastructure. They’re looking to bring on a Site Reliability Engineer with deep experience in observability. If you’ve worked with tools like Prometheus in AWS, supported development teams with tracing and performance insights, and thrive in a high-scale … distributed environment - this could be a great next step. What You’ll Be Doing: Managing and improving observability tools like Prometheus, Grafana, and CloudWatch Helping product teams with tracing and monitoring to improve performance and reliability Defining and improving SLIs/SLOs, automating tasks, and reducing operational noise Working with … Terraform, and CI/CD tools What They’re Looking For: Experience in SRE or DevOps roles in a production environment Strong knowledge of observability tools, especially Prometheus in AWS Experience with tracing, metrics, and logs to support development teams Skills in Python or Go, and a good understanding of …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Halian
team, based remotely in the U.S. We’re looking for a technically skilled and automation-driven individual with strong experience in cloud infrastructure and observability tools to help scale our client’s services to millions of endpoints globally. This is an exciting opportunity to work at the core of platform … cause analysis (RCA) and create technical documentation and SOPs. Develop scripts and tools to automate infrastructure provisioning and application deployment. Implement best practices for observability and monitoring using tools like New Relic, Datadog, or Splunk. Influence design decisions to ensure scalable, secure architecture and high availability. Key Requirements: 5+ years …
a Platform Engineer, you’ll help design, build, and support the infrastructure and tooling that underpins critical systems – from CI/CD pipelines and observability tooling to service deployment and runtime environments. You’ll be part of a high-trust team that values clean code, quick iteration, and leaving things … tooling and services Hands-on experience with AWS, Kubernetes, Docker, and modern CI/CD pipelines Familiarity with infrastructure-as-code (e.g., Terraform) and observability tooling (e.g., Prometheus, Grafana) Comfortable working on distributed systems and improving developer workflows A product mindset and a collaborative approach to problem-solving Experience with …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Ocho
Serverless, and S3 for cloud-native data solutions • Collaborate with frontend (Vue.js) and data platform engineers (Snowflake, Airflow) • Contribute to CI/CD pipelines, observability, and day-2 operational tooling • Partner with product and architecture teams to translate customer needs into working systems • Engage in cross-regional collaboration (US, EMEA … S3 • Experience with event-driven design using SQS, SNS, EventBridge • Comfortable working in containerized and serverless contexts (12-factor apps) • Hands-on experience with observability stacks: metrics, traces, logs • Strong communicator able to interface confidently with both technical and non-technical audiences Bonus Experience • Familiarity with IaC frameworks (CloudFormation, Terraform …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Primis
cloud and on-prem setup. The Network Engineer will work across design, implementation, and ongoing responsibility for hosting environments, with a focus on observability, high availability, and infosec, along with IaC. The Network Engineer will be hands-on and will work alongside various other teams and stakeholders overseeing the DevOps and … Engineer will be involved with the design, implementation and ongoing management of the hosting environment. The main purpose of the environments will be embracing observability, high availability and infosec, along with IaC. You’ll be making sure that you support the system and network infrastructure with switches, routers, load balancers …
web applications. Collaborate closely with DevOps on CI/CD pipelines, deployment workflows, infrastructure, and SecOps compliance. Uphold high standards for code quality, system observability, and technical documentation. Act as the technical lead, setting direction and best practices for the full-stack engineering team. Mentor engineers, providing guidance on architecture … with React, TypeScript, .NET Core, SOAP/REST APIs, MySQL/PostgreSQL, Red Hat OpenShift, and Kubernetes Understanding of DevOps, cloud deployments, and service observability Bonus: Interest/experience in AI, digital twins, Nvidia Omniverse SDK & APIs, Universal Scene Description What We Offer: Reimbursement for tuition and professional dues Three …
Cambridge, England, United Kingdom Hybrid / WFH Options
Trust In SODA
Programme Manager – AI Ops & Observability Rollout Location: Hybrid (40% in-office minimum) | Cambridge Type: 6-month Contract Rate: £670 - £710 per day Inside IR35 A major enterprise is seeking an experienced Programme Manager to lead the organisation-wide rollout of a new AI Ops and Observability Platform. This strategic … resilience, reducing downtime, and enabling proactive incident management. You’ll drive end-to-end delivery, from roadmap ownership to stakeholder alignment, while shaping how observability is embedded into tools, workflows, and culture. This is a high-impact role requiring coordination across engineering, IT, and business teams. Key Responsibilities: Lead planning … and execution of the observability platform rollout Manage roadmap, risks, and dependencies across functions Oversee change management, communications, and adoption strategies Engage stakeholders at all levels to ensure alignment and delivery Track and report KPIs to demonstrate business value What You Bring: Proven experience in large-scale programme delivery or …
processes for speed and quality, and drive product-focused operating models. Proficiency in DevOps, CI/CD, DevSecOps, Site Reliability Engineering (SRE), developer experience, observability, and hybrid/multi-cloud environments. Bonus: Familiarity with observability platforms and practical experience working with development teams to enhance monitoring and telemetry practices. Please …