Leeds, Yorkshire, United Kingdom Hybrid/Remote Options
Parallax Agency Ltd
Lambda, CloudFront, RDS, etc.) and Azure (you don't need to be an expert but being interested helps!) Promote strong engineering practices around code quality, automated testing, peer reviews, observability, and security and help instil a culture of quality and accountability in engineering. Collaborate closely with designers, product managers and QA to ensure solutions are user-focused, technically sound, and More ❯
strongly typed programming language and one dynamic programming language, ideally Rust and Node.js Experience with Public Cloud providers, ideally AWS Experience with CI/CD tooling and pipelines Any experience with Observability platforms such as Grafana would be advantageous. Our Commitment to Diversity and Inclusion Build your job in a place that thrives on diversity, inclusion, and belonging. We believe in maintaining More ❯
Liverpool, Merseyside, England, United Kingdom Hybrid/Remote Options
Broster Buchanan
scalability and resilience in applications handling large volumes of traffic and burst events. Work collaboratively with cross-functional teams, including DevOps, Infrastructure, and Product, to deliver robust systems. Leverage observability tools to monitor, alert, and troubleshoot application and integration health. Stay current on AI-driven software development practices (e.g., GPT-assisted development, Agentic AI workflows) and suggest practical implementations. Participate More ❯
experience) Full time/37 hrs a week/Permanent Huntingdon or Lincoln - Hybrid Make every drop of your potential count. Join our team We're looking for an Observability Engineer to join our Digital, Data and Technology team to help us transform how we monitor and manage our services. In this pivotal role, you'll design and implement observability … insight into the health, performance, and reliability of Anglian Water's digital platforms and products. Your work will enable proactive incident detection, root cause analysis, and continuous improvement, embedding observability as a core engineering discipline across our organisation. What will you be doing as an Observability Engineer? Design and implement observability solutions (logs, metrics, traces) using tools such as Prometheus … Grafana, Elastic Stack, Azure Monitor, Dynatrace. Build dashboards, alerts, and visualisations aligned to user and business needs. Integrate observability tooling into CI/CD pipelines and infrastructure-as-code. Standardise tooling across teams and support automation of alert responses and root cause analysis. Collaborate with development, operations, and platform teams to define SLIs, SLOs, and error budgets. Conduct root cause More ❯
Nelson, Lancashire, England, United Kingdom Hybrid/Remote Options
Lorien
cloud infrastructure on Azure or AWS. Driving Infrastructure as Code (IaC) practices using Terraform. Building and optimising CI/CD pipelines to accelerate delivery. Implementing and maintaining monitoring and observability with Prometheus and Grafana. Enabling team collaboration and incident response through Slack and other ChatOps tools. Leading, mentoring, and supporting engineers (or preparing to step into people management if you More ❯
Newcastle Upon Tyne, Tyne and Wear, England, United Kingdom Hybrid/Remote Options
Lorien
cloud infrastructure on Azure or AWS. Driving Infrastructure as Code (IaC) practices using Terraform. Building and optimising CI/CD pipelines to accelerate delivery. Implementing and maintaining monitoring and observability with Prometheus and Grafana. Enabling team collaboration and incident response through Slack and other ChatOps tools. Leading, mentoring, and supporting engineers (or preparing to step into people management if you More ❯
incidents. Flexible and adaptable to technical and business priorities. Nice-to-Have Experience supporting scientific or data-intensive applications. Background in post-mortem facilitation and follow-up. Enthusiasm for observability, performance tuning, and cost optimisation. More ❯
PaaS, governance, networking, and identity). AZ-104 and AZ-305 desirable. Skilled in scripting and automation (PowerShell required; Bicep/Terraform desirable). Experience with Azure monitoring and observability (Azure Monitor, Log Analytics, Datadog). Familiar with backup, disaster recovery, and business continuity tooling (Azure Backup, RSV, ASR). Strong working knowledge of networking concepts (VNets, VPNs, ExpressRoute, NSGs More ❯
England, Beckwith, North Yorkshire, United Kingdom Hybrid/Remote Options
The Bridge IT Recruitment
PaaS, governance, networking, and identity). AZ-104 and AZ-305 desirable. Skilled in scripting and automation (PowerShell required; Bicep/Terraform desirable). Experience with Azure monitoring and observability (Azure Monitor, Log Analytics, Datadog). Familiar with backup, disaster recovery, and business continuity tooling (Azure Backup, RSV, ASR). Strong working knowledge of networking concepts (VNets, VPNs, ExpressRoute, NSGs More ❯
Newcastle Upon Tyne, Tyne and Wear, North East England, United Kingdom
Zynk
specific data models. Integration Delivery – Apply integration patterns such as publish/subscribe, request/reply, file‐drop, and event‐driven processing across on‐premises and cloud endpoints. Quality & Observability – Implement integration tests, configure logging and metrics, diagnose performance bottlenecks, and ensure SLAs are consistently met. Collaboration – Partner with Solution Architects, Support Engineers, and end‐users to gather requirements, provide More ❯
talking smooth, automated, zero-touch deployments. Driving an Infrastructure as Code first approach (Terraform, Bicep... you know the drill). Building shared platform services that make developers’ lives easier: observability, secure pipelines, vulnerability scanning, and more. Embedding a FinOps mindset to keep things efficient and cost-effective. Acting as the go-to Azure expert, sharing best practices and shaping architectural More ❯
Manchester, England, United Kingdom Hybrid/Remote Options
La Fosse
and cost Mentor service teams on cloud best practices Support service management processes (change, incident, DR) Ideal Experience: 5+ years in AWS cloud engineering Strong knowledge of network, backup, observability, and security Experience with ISO27001, NIST, or Cyber Essentials Proficiency in Terraform, Git, and CI/CD pipelines Familiarity with serverless architectures and AWS services (e.g. Lambda, EKS, CloudFront, GuardDuty More ❯
services/message buses and other architectural elements Deploy these applications using features such as containers to cloud leveraging CI/CD to support this process backed with good observability when running these in production Ensure quality through the creation of documentation and use of unit/integration/contract testing with a consideration of security/performance requirements More ❯
Our client is a high-growth AI-driven observability company seeking an Observability Implementation Engineer to support a major enterprise customer through a large-scale migration from a legacy observability platform. This role blends hands-on technical work with customer-facing collaboration to ensure smooth adoption, efficient troubleshooting, and long-term platform success. Key Responsibilities Customer Support Provide daily … data continuity and accuracy. Enablement & Training Deliver hands-on training on platform features, query language, and troubleshooting workflows. Create tailored training materials and guide teams in leveraging AI-assisted observability capabilities. Cross-Functional Collaboration Partner with engineering and customer success teams to document issues, manage Jira tickets, and maintain clear technical narratives. Support war rooms and incident-related sessions as … needed. Required Qualifications 3+ years of experience with enterprise observability platforms (ELK preferred; others beneficial). Strong experience with AWS, GCP, or Azure, plus Kubernetes or other containerized environments. Proficiency in search and query languages (KQL, PromQL, SPL, Lucene, Elasticsearch DSL, etc.). Understanding of OpenTelemetry standards, metrics, logs, traces, and observability best practices. Experience with APIs, Infrastructure as Code More ❯
Leading and scaling a technically advanced team responsible for building and productionising a mission-critical backend platform. Architecting and maintaining high-availability, data-intensive systems across AWS with strong observability and monitoring foundations. Collaborating with cross-functional teams to integrate APIs and services, maintaining clean architecture principles. Driving technical quality through mentorship, test-driven development, and modern CI/CD More ❯
best they can be. Care about agility as much as you care for scalability and availability. Continuous deployment keeps us focused on incremental releases. Take responsibility for platform health and observability, using our own data to understand user behaviour and drive product development. Skills we're looking for Core skills: A strong foundation in software engineering principles and deep knowledge of More ❯
Knutsford, Cheshire, England, United Kingdom Hybrid/Remote Options
Tenth Revolution Group
Build and maintain scalable data architecture using the Medallion model. Develop dbt/dataform models and reusable SQL transformations. Create impactful Tableau dashboards for business insights. Ensure data quality, observability, and governance. Drive CI/CD practices and collaborate across teams. What We're Looking For: Proven experience with BigQuery, dbt/dataform, and Tableau. Strong SQL and modern data More ❯
NUnit). Expertise in RESTful and GraphQL APIs, Git, and SOLID principles. Strategic thinking, strong communication, and a love for collaboration. Bonus: Experience with Azure, DevOps, Entity Framework, and observability practices. Why You'll Love It Here: Developer-led culture with hack days, and open access to leadership. Transparent progression and tailored development plans. Great perks: profit share, training budget More ❯
Employment Type: Permanent
Salary: £70000 - £80000/annum Pension, 25 days holiday, Profit Sha
Ringway, Altrincham, Cheshire, England, United Kingdom
The Hut Group
potential technical risks and develop strategies to mitigate them, ensuring that the application is secure, robust and reliable Champion performance optimisation across the frontend stack while ensuring accessibility and observability are baked into all solutions Deeply committed to crafting intuitive, impactful, and optimised user experiences that turn complex workflows into seamless, engaging journeys Share your knowledge within a democratic team More ❯
Wigan, Lancashire, England, United Kingdom Hybrid/Remote Options
Searchability
As part of their continued investment in reliability and platform performance, they are now seeking an experienced Site Reliability Engineer to strengthen their engineering function and help evolve their observability and automation capabilities. THE BENEFITS Hybrid working model (office and remote) Opportunity to define and lead SRE strategy within a collaborative culture Exposure to modern cloud-native and containerised environments … and performance of complex online platforms supporting high-volume transactions. Working closely with operations and product teams, you'll monitor production systems, develop automation to improve uptime, and refine observability to provide real-time insight into platform health. You'll also play a key role in performance testing, system tuning and incident management to ensure smooth operation during critical events. … SITE RELIABILITY ENGINEER ESSENTIAL SKILLS At least 2 years' experience working as an SRE Deep understanding of system reliability, scalability and performance tuning Experience with observability tools (Grafana, Prometheus, OpenTelemetry) Proficiency in a programming language such as Go or .NET for automation and debugging Hands-on experience with AWS or another major cloud platform Knowledge of Kubernetes, Terraform, and Infrastructure More ❯
Wigan, Greater Manchester, United Kingdom Hybrid/Remote Options
Searchability (UK) Ltd
As part of their continued investment in reliability and platform performance, they are now seeking an experienced Site Reliability Engineer to strengthen their engineering function and help evolve their observability and automation capabilities. THE BENEFITS Hybrid working model (office and remote) Opportunity to define and lead SRE strategy within a collaborative culture Exposure to modern cloud-native and containerised environments … and performance of complex online platforms supporting high-volume transactions. Working closely with operations and product teams, you'll monitor production systems, develop automation to improve uptime, and refine observability to provide real-time insight into platform health. You'll also play a key role in performance testing, system tuning and incident management to ensure smooth operation during critical events. … SITE RELIABILITY ENGINEER ESSENTIAL SKILLS At least 2 years' experience working as an SRE Deep understanding of system reliability, scalability and performance tuning Experience with observability tools (Grafana, Prometheus, OpenTelemetry) Proficiency in a programming language such as Go or .NET for automation and debugging Hands-on experience with AWS or another major cloud platform Knowledge of Kubernetes, Terraform, and Infrastructure More ❯
Manchester Area, United Kingdom Hybrid/Remote Options
Morson Edge
Great opportunity for Senior Python Engineers to work remotely for a UK based AI scale-up. You'd join a large engineering department and would work within a cross functional product-based team responsible for building cloud-native, event-driven More ❯
of a mission-critical, cloud-native platform transforming the way the UK housing market operates. You'll lead the UK Platform Team, drive incident response, ensure stability, and champion observability and service governance - all while collaborating with global technology teams. Opportunities like this are rare: you'll help bring a proven digital housing platform to the UK, building the operational … industry. What do we need from you? Proven experience in Platform Operations, leading on platform reliability Hands-on familiarity with: AWS, Linux, Terraform, CI/CD pipelines Monitoring/observability tech such as Grafana, Prometheus, Splunk, New Relic, PagerDuty Basic diagnostics using SQL/PostgreSQL Strong background managing P1 and P2 incidents Ability to lead small teams Exposure to risk … ll act as the UK operational bridge between local and global engineering and service teams. Key Focus Areas Own UK platform operations end-to-end - from daily stability and observability to releases, patching, and service transitions. Lead major incidents with confidence, driving fast technical triage, clear comms, and rapid service restoration. Lift platform performance by owning SLAs/KRIs, chairing More ❯