corporate offices, supply chains, and data centres. You will proactively mitigate risks by reviewing analytics on network metrics using Meraki Dashboards and other network observability toolsets. Additionally, you will oversee the service performance of global network managed services, advocating for regional service teams on continuous improvement initiatives. You will manage …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Embarcaderomediagroup
of our engineering operations, bringing together SRE principles and modern platform engineering practices. This includes combining principles of SRE - such as service-level reliability, observability, incident response - with platform engineering practices like GitOps, Infrastructure as Code, DevSecOps automation, and self-service enablement, to help development teams ship faster, safer, and … more cost-efficiently. What you'll be doing: Designing and operating highly reliable, scalable, and secure Azure-based platforms Applying SRE principles like SLOs, observability, and incident management to drive service reliability Building Infrastructure as Code using Terraform (v1.7+) and GitOps workflows Enabling teams through platform tools, reusable Terraform modules … DB, etc.) Strong Infrastructure as Code skills with Terraform (v1.7+) Experience with CI/CD pipelines, GitOps, and automation tools (PowerShell, Bash) Familiarity with observability and incident tools like Datadog, ELK, and synthetic monitoring Solid understanding of networking (TCP/IP, Load Balancing, DNS, Routing) Good knowledge of DevSecOps practices
Manchester Area, United Kingdom Hybrid / WFH Options
LinuxRecruit
prior working experience of approaching tasks methodically to solve engineering problems is key. In addition to your programming skills, knowledge of metrics, monitoring and observability would be highly beneficial. Experience with the full SDLC and deployments of code through pipelines into containers - modern cloud native software engineering, would be beneficial.
Manchester Area, United Kingdom Hybrid / WFH Options
On the Beach
Write clean, maintainable code and participate actively in code reviews. Operational Support: Support production systems, identify root causes of issues, and contribute to improving observability and automation. Knowledge Sharing: Document systems and help onboard and guide new team members. Skills and Experience Cloud Experience: Solid hands-on experience with AWS
bolton, greater manchester, north west england, United Kingdom Hybrid / WFH Options
On the Beach
Write clean, maintainable code and participate actively in code reviews. Operational Support: Support production systems, identify root causes of issues, and contribute to improving observability and automation. Knowledge Sharing: Document systems and help onboard and guide new team members. Skills and Experience Cloud Experience: Solid hands-on experience with AWS
reporting and security leads to ensure data platforms are meeting product needs to service client expectations. Guide teams to ensure a high degree of observability of data platform reliability and performance, working alongside the Head of Platform to enhance visibility of these metrics throughout the business. Drive innovation in related …
needs. Project Delivery: • Possess effective project management skills and own the successful delivery of key platform initiatives including CI/CD improvements, infrastructure upgrades, observability enhancements and technology swapouts. • Manage capacity planning and prioritisation of team work, balancing short-term demands with long-term technical vision • Have a commercial mindset … record of designing, building, and operating scalable, secure, and cost-efficient cloud platforms. · Hands-on experience implementing CI/CD pipelines, infrastructure automation, and observability frameworks. · Deep understanding of operational excellence practices, including system reliability, incident management, and SRE methodologies. · Demonstrated ability to embed security and compliance practices across the …
Manchester Area, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability … maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles … including the creation and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques and best practice including Splunk, New Relic, Grafana and PagerDuty. Excellent knowledge of programming languages including Python, Golang and JavaScript. Knowledge and …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
bet365 Group
A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting … maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles … including the creation and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques and best practice including Splunk, New Relic, Grafana and PagerDuty. Knowledge and experience of modern software development techniques and lifecycles. Experience with …
reliability, and scalability. Work cross-functionally to enhance supplier integrations, improve data accuracy, and optimise booking processes. Champion best practices across CI/CD, observability, and incident response. Remain technically involved through code reviews, architectural guidance, and complex problem-solving. Ideal Candidate: Proven track record of leading engineering teams in … ideally AWS). Hands-on experience with Kubernetes, Terraform, and modern DevOps pipelines. Familiarity with GraphQL and API-first architectures. Strong focus on monitoring, observability, and system performance. Experience with real-time data processing and transactional system design. Skilled at balancing technical debt, delivery speed, and operational stability. Passionate about …
stockport, north west england, United Kingdom Hybrid / WFH Options
Flowmentum, Inc
Implement infrastructure-as-code with Terraform and PowerShell Collaborate across dev, QA, and SRE teams to drive reliability, security, and performance Improve deployment workflows, observability, and platform resiliency at scale ✅ What You Bring: Expert-level Azure networking skills: VNet peering, routing, private endpoints, DNS, WAFs, etc. Advanced .NET Framework 4.6 … with Azure DevOps pipelines and CI/CD best practices Solid scripting ability in PowerShell and infrastructure provisioning with Terraform Familiarity with QA and observability tools a plus (e.g., App Insights, Log Analytics) 🌟 Why Join Us: Remote-first and globally distributed team ROWE culture – results over hours Competitive compensation + …
manchester, north west england, United Kingdom Hybrid / WFH Options
Flowmentum, Inc
Implement infrastructure-as-code with Terraform and PowerShell Collaborate across dev, QA, and SRE teams to drive reliability, security, and performance Improve deployment workflows, observability, and platform resiliency at scale ✅ What You Bring: Expert-level Azure networking skills: VNet peering, routing, private endpoints, DNS, WAFs, etc. Advanced .NET Framework 4.6 … with Azure DevOps pipelines and CI/CD best practices Solid scripting ability in PowerShell and infrastructure provisioning with Terraform Familiarity with QA and observability tools a plus (e.g., App Insights, Log Analytics) 🌟 Why Join Us: Remote-first and globally distributed team ROWE culture – results over hours Competitive compensation + …
bolton, greater manchester, north west england, United Kingdom Hybrid / WFH Options
Flowmentum, Inc
Implement infrastructure-as-code with Terraform and PowerShell Collaborate across dev, QA, and SRE teams to drive reliability, security, and performance Improve deployment workflows, observability, and platform resiliency at scale ✅ What You Bring: Expert-level Azure networking skills: VNet peering, routing, private endpoints, DNS, WAFs, etc. Advanced .NET Framework 4.6 … with Azure DevOps pipelines and CI/CD best practices Solid scripting ability in PowerShell and infrastructure provisioning with Terraform Familiarity with QA and observability tools a plus (e.g., App Insights, Log Analytics) 🌟 Why Join Us: Remote-first and globally distributed team ROWE culture – results over hours Competitive compensation + …