slough, south east england, united kingdom Hybrid / WFH Options
Quantum Technology Solutions Inc
granular permission structures, RBAC, and least-privilege configurations across all resources. · Build and manage infrastructure using Terraform and Azure CLI, enabling consistency, traceability, and automated change control. · Implement strong observability and compliance frameworks (metrics, logging, tracing, and audits) to guarantee visibility, reliability, and adherence to high-regulation standards. · Support and automate development, staging/UAT, and production environments with robust More ❯
london, south east england, united kingdom Hybrid / WFH Options
Quantum Technology Solutions Inc
granular permission structures, RBAC, and least-privilege configurations across all resources. · Build and manage infrastructure using Terraform and Azure CLI, enabling consistency, traceability, and automated change control. · Implement strong observability and compliance frameworks (metrics, logging, tracing, and audits) to guarantee visibility, reliability, and adherence to high-regulation standards. · Support and automate development, staging/UAT, and production environments with robust More ❯
london (city of london), south east england, united kingdom Hybrid / WFH Options
Quantum Technology Solutions Inc
granular permission structures, RBAC, and least-privilege configurations across all resources. · Build and manage infrastructure using Terraform and Azure CLI, enabling consistency, traceability, and automated change control. · Implement strong observability and compliance frameworks (metrics, logging, tracing, and audits) to guarantee visibility, reliability, and adherence to high-regulation standards. · Support and automate development, staging/UAT, and production environments with robust More ❯
colleagues and clients across the Snowflake ecosystem · Experience in designing and delivering business solutions on other modern data platforms (e.g. Databricks, Azure, AWS or GCP native stacks) · Experience with platform observability and CI/CD for data platforms · Hands-on experience with modern data engineering tools such as dbt, Fivetran, Matillion or Airflow · History of supporting pre-sales activities in a product or More ❯
Manchester, England, United Kingdom Hybrid / WFH Options
Awaze
environment. Partner with Product to balance innovation with reliability, ensuring our core platforms can scale to support millions of bookings. Champion engineering best practices such as CI/CD, observability, automated testing, and platform reliability. Create an environment where teams can experiment, learn, and deliver value quickly and safely. Play a key role in shaping how we attract, develop, and More ❯
bolton, greater manchester, north west england, united kingdom Hybrid / WFH Options
Awaze
environment. Partner with Product to balance innovation with reliability, ensuring our core platforms can scale to support millions of bookings. Champion engineering best practices such as CI/CD, observability, automated testing, and platform reliability. Create an environment where teams can experiment, learn, and deliver value quickly and safely. Play a key role in shaping how we attract, develop, and More ❯
warrington, cheshire, north west england, united kingdom Hybrid / WFH Options
Awaze
environment. Partner with Product to balance innovation with reliability, ensuring our core platforms can scale to support millions of bookings. Champion engineering best practices such as CI/CD, observability, automated testing, and platform reliability. Create an environment where teams can experiment, learn, and deliver value quickly and safely. Play a key role in shaping how we attract, develop, and More ❯
Be Doing · Building the backend for various user-facing features · Optimise client APIs and services · Improve database and infrastructure performance by implementing caching solutions and optimising data queries. · Improve observability, monitoring, and alerting for our service so that we can better respond to operational incidents · Scale our service via architectural changes as well as infrastructure improvements · What You'll Actually More ❯
or communicating with robotic automation systems and integrating with physical devices · Desktop app development with Electron · CI/CD setup, rollback strategies, and deployment automation · Sentry, New Relic, or other observability tooling implementation More ❯
Birmingham, West Midlands, United Kingdom Hybrid / WFH Options
ByteHire
or communicating with robotic automation systems and integrating with physical devices · Desktop app development with Electron · CI/CD setup, rollback strategies, and deployment automation · Sentry, New Relic, or other observability tooling implementation More ❯
other internal teams to fully understand client requirements and deliver tailored technical solutions. Design and implement scalable, future-proof architectures for new third-party connectors and integrations. Enhance system observability by improving diagnostics, logging, and tracing to aid technical support teams in resolving issues swiftly. Oversee the ongoing development and management of the public API, covering REST and event streaming More ❯
require both strategic foresight and technical precision. Set engineering standards by developing modular, performant, and maintainable code that leads by example. Own the full product lifecycle (including design, deployment, observability, and long-term maintenance), ensuring platform reliability at scale. Collaborate cross-functionally with Product, Quant, and Engineering leadership to align technical execution with business goals. Apply advanced software design methodologies More ❯
cloud services. Partner with engineers to shape code for testability and embed quality early in the development process. Lead cross-functional quality initiatives to improve CI/CD pipelines, observability, and release readiness. Drive performance, load, and resilience testing, especially for latency-sensitive, real-time systems. Mentor other SDETs and developers in automation strategy, debugging, and risk mitigation. Own root More ❯
Gloucester, Gloucestershire, United Kingdom Hybrid / WFH Options
Howden Group
experience meeting relevant standards. Experience with Test-driven Development and a strong commitment to high-quality, maintainable and easy-to-understand code. Strong understanding of security. Experience of using observability and monitoring tools, and of gathering data insights. Detailed exposure to Azure cloud technologies. A strong communicator and facilitator. An experimental and scientific mindset leading to data-led decision More ❯
Hereford, Herefordshire, West Midlands, United Kingdom Hybrid / WFH Options
Twinstream Limited
ensuring the availability, performance, and resilience of our secure, high-impact services. You'll work with development and support teams to evolve infrastructure, streamline delivery pipelines, and strengthen system observability — ensuring performance bottlenecks and reliability risks are resolved before they ever reach production. Expect a technically rich environment, diverse challenges, and the opportunity to make a measurable difference. Key Responsibilities … Reliability Engineer: Partner with Software Engineers to enhance reliability and performance across complex systems Collaborate with SysAdmins to automate toil and eliminate manual intervention Build smarter monitoring, logging, and observability pipelines to detect and resolve issues early Support and improve development environments to hit delivery and quality goals Research new tools, services, and architectures to drive scalability and resilience Expand … Ansible, Chef, etc.) Skilled with Docker and Kubernetes/OpenShift/Docker Swarm Hands-on experience building and maintaining CI/CD pipelines (e.g. Jenkins) Deep understanding of monitoring & observability tools (Grafana, Prometheus, InfluxDB) Solid grounding in Linux, network security, SQL, and AWS (EC2, S3, RDS, Lambda) Comfortable with MQ messaging (RabbitMQ or similar) Bonus points for: Experience with Azure More ❯
GraphQL). Establish CI/CD and GitOps practices with GitHub Actions and ArgoCD, including automated testing, vulnerability scanning, and environment promotion workflows. Drive the definition and implementation of observability standards (Prometheus, Grafana, Loki/ELK, Jaeger, Sentry), enabling end-to-end visibility and SLA tracking. Define scalability and reliability patterns (KEDA, HPA, circuit breakers, bulkheads, caching tiers) and ensure … Next.js, micro frontends, monorepos) and API integration patterns. Proficiency in API and event contract design using OpenAPI and AsyncAPI; knowledge of GraphQL federation is a plus. Strong background in observability, monitoring, and tracing, with Prometheus/Grafana/ELK or equivalent. Familiarity with cloud-agnostic deployments (AWS, GCP, or Azure) and cost/performance trade-offs. Excellent technical leadership, communication More ❯
manchester, north west england, united kingdom Hybrid / WFH Options
On the Beach
ll Be Doing Day To Day Technical Leadership : Serve as a technical authority for the Platforms Team, guiding the design, implementation, and optimization of our AWS, GitOps, Monitoring and Observability, Developer Portal and Kubernetes (EKS) platforms. Architecture and Design : Architect and design scalable, reliable, and secure Platform solutions, ensuring they meet current and future needs. Platform Management : Lead the management … with modern programming/scripting languages such as C#, Go, Python, TypeScript, or Java. Advanced scripting skills in languages such as Python, Bash, or similar. Expertise with monitoring, observability and log analytics tooling such as New Relic, Prometheus, Grafana, and the ELK stack. Any experience implementing AI Ops and/or AI Coding tooling to enhance Developer Experience and enable efficient More ❯
innovative solutions to hundreds of thousands of businesses and empower millions of developers to craft personalized customer experiences. This role is for a Software Engineer on Twilio's Platform Observability team, focused on rebuilding and unifying our observability stack to enable faster incident response, deeper insights, and more cost-effective platform operations. You'll help re-architect how telemetry flows … and is utilized at Twilio, making it structured, accessible, affordable, and actionable. Over the next 3 years, Twilio is rebuilding nearly every component of our observability platform, from data collection to real-time analytics. You will drive core initiatives that move Twilio from fragmented tooling to a unified, OpenTelemetry-first observability stack built for scale. You'll lead technically and … designing platform components, influencing architectural decisions, mentoring engineers, and engaging with teams across Platform Engineering and R&D. Responsibilities Lead the end-to-end architecture and delivery of key observability platform components, with a focus on reliability, scalability, and usability. Drive consistency and quality across all observability signals (logs, metrics, traces, and continuous profiling), building intuitive workflows for engineers. Serve More ❯
Job Title: DevOps Observability Engineer Duration: long-term contract Location: Hybrid (Sheffield or London) Visa: Only British/ILR/Dependent Visa (No Sponsorship Available, no PSW) We are looking for a hands-on DevOps Observability Engineer to design, implement, and lead enterprise observability solutions. You will drive DevOps adoption, build scalable telemetry systems, integrate with monitoring tools, and optimize … performance across cloud and on-prem infrastructure. Key Responsibilities: Design and implement observability solutions using OpenTelemetry across storage platforms. Develop and maintain CI/CD pipelines, distributed tracing, metrics, and logging. Integrate telemetry with tools like Prometheus, Grafana, Kafka, Splunk, Loki. Analyze telemetry data, optimize performance, and troubleshoot issues. Document setups, maintain standards, and support DevOps adoption across teams. More ❯
Job Title: DevOps Observability Engineer Duration: long-term contract Location: Hybrid (Sheffield or London) Visa: Only British/ILR/Dependent Visa (No Sponsorship Available, no PSW) We are looking for a hands-on DevOps Observability Engineer to design, implement, and lead enterprise … observability solutions. You will drive DevOps adoption, build scalable telemetry systems, integrate with monitoring tools, and optimize performance across cloud and on-prem infrastructure. Key Responsibilities: Design and implement observability solutions using OpenTelemetry across storage platforms. Develop and maintain CI/CD pipelines, distributed tracing, metrics, and logging. Integrate telemetry with tools like Prometheus, Grafana, Kafka, Splunk, Loki. Analyze telemetry More ❯