Welwyn Garden City, England, United Kingdom Hybrid / WFH Options
PayPoint plc
Cover on-call rotation for production support (1 week out of 6)
As well as making improvements to:
• Deployment automation and release management processes
• Application and infrastructure monitoring and observability
• Security scanning and vulnerability management in pipelines
• Performance optimization and capacity planning
• Development team productivity through tooling and automation
What we would like from you
• Strong experience with CI/ …
EC2, VPC, etc.)
⚙️ Strong IaC skills with Terraform and CI/CD pipelines
🐳 Kubernetes operations expertise on AWS (EKS)
🔒 Solid grounding in Linux, networking, and cloud security
📊 Familiarity with observability stacks (Prometheus, Grafana, Loki)
If you’re ready to shape the infrastructure behind cutting-edge AI used by global enterprises, we’d love to hear from you.
Cambridge, England, United Kingdom Hybrid / WFH Options
Morson Edge
to become a leader as they scale
What they're looking for...
Multi-cloud experience - GCP, AWS, Azure
Experience across all modern DevOps tech within IaC, CI/CD, observability, containers etc.
Strong experience within software development, not just infrastructure, within Go or Python
Have been within a leadership or architect role, building from scratch, and have a track record …
Cheshire East, England, United Kingdom Hybrid / WFH Options
Accelero
or ARM templates. Experience with automation and scripting using PowerShell, Bash, or Python. Strong knowledge of cloud security practices and governance models within Azure environments. Experience with monitoring and observability tools such as Azure Monitor, Log Analytics, Prometheus, or Grafana. Strong troubleshooting and analytical skills, particularly in complex cloud and networked environments. If you're interested in the role, please …
of investment into the latest tech & AWS tools
What they're looking for...
Strong experience within AWS & AWS services within networking and security
Proficient within Terraform, CloudFormation or Ansible
Observability tools like CloudWatch, CloudTrail, OpenSearch, Grafana/Kinesis
Have a background within core infrastructure services like networking, security, patching and have transitioned to a Platform/Cloud focused Engineer …
and resource structures. Assist with integration between Azure-based CCaaS and D365 solutions and existing AWS-hosted applications. Set up, configure and maintain DevOps pipelines for solution deployment. Support observability tooling and dashboards for monitoring platform health and performance. Act as a technical liaison between stakeholders, programme leadership and engineering teams. Provide 3rd line support for the CRM and CCaaS …
Stevenage, Hertfordshire, England, United Kingdom Hybrid / WFH Options
MBDA
stakeholders to meet the ever-evolving challenges of the cyber threat landscape. Key responsibilities include: Act as the subject matter expert (SME) for Splunk across all cyber security and observability use cases. Lead SOC automation initiatives using scripting and SOAR tools, optimising processes through AI and ML technologies. Support alert tuning, connectivity, and visibility across monitored networks and infrastructure. Maintain …
Stevenage, Hertfordshire, South East, United Kingdom Hybrid / WFH Options
MBDA
stakeholders to meet the ever-evolving challenges of the cyber threat landscape. Key responsibilities include: Act as the subject matter expert (SME) for Splunk across all cyber security and observability use cases. Lead SOC automation initiatives using scripting and SOAR tools, optimising processes through AI and ML technologies. Support alert tuning, connectivity, and visibility across monitored networks and infrastructure. Maintain …
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Speechmatics
postmortems and ensuring the same incident doesn't happen twice. Managing and improving GitOps release workflows and CI/CD pipelines. Monitoring system performance and troubleshooting production environments. Implementing observability improvements using OpenTelemetry tooling. Automating processes to reduce manual effort and create self-healing systems. Taking part in an on-call rota for production systems that has a generous daily pay … dive deep into new technologies; you thrive on learning as you go. Prior experience with on-call rotations and incident response is a plus. Familiarity with OpenTelemetry and related observability tooling is advantageous. We encourage you to apply even if you do not feel you match all of the requirements exactly. The list of requirements is intended to show the …
Chelmsford, Essex, United Kingdom Hybrid / WFH Options
Brooks Automation, Inc
infrastructure and security services, ensuring operational excellence and incident response readiness. Partner with the CISO to shape long-term strategy and roadmap for secure, resilient IT services. Drive automation, observability, and scalability across the infrastructure and security stack. Serve as a key escalation point for technical troubleshooting and security event resolution. Guide vendor selection, contract negotiations, and service-level adherence …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Willis Towers Watson
Description We are looking for an experienced Site Reliability Engineer to join the Igloo team in Cambridge to champion observability and delivery. The candidate should have strong communication skills, experience in coaching or sharing knowledge, and proficiency in Azure and Observability platforms. Join Insurance Consulting and Technology (ICT) during a transformative period aimed at enhancing customer and business value. You … new and exciting uses of their technology. This role will have the opportunity to help the team and product deal with exciting, complex and large-scale client propositions where observability will be essential and help transform how the product is designed and deployed. You will join a cross-team guild of Site Reliability Engineers, which enables you to not only … influence direction within your product family, but also to help shape how we handle observability and monitoring across ICT. This role is open to flexible and hybrid working arrangements, with presence in the Cambridge office a minimum of two days per week. The Role: Collaborate with cross-functional teams to ensure the reliability, availability, and performance of our client-facing …
of the digital landscape across the business. We’re looking to grow our data team with a skilled, proactive Data Engineer who is keen to establish strong data governance and observability practices, ensuring datasets are versioned, catalogued and fully traceable from source to output while sharing their knowledge within the data team. Working closely with the IT Team, vendors and business … Maintain a medallion architecture (Bronze–Gold) for trusted, refined datasets. Develop, optimize, and maintain complex SQL queries to support analytics and reporting requirements. Implement data quality, testing and observability; ensure lineage, accuracy and compliance. Enable self-serve analytics through well-documented models and transformation logic. Integrate internal/external sources. Manage data infrastructure (warehouses, data lakes, storage); tune performance … and monitor health. Troubleshoot incidents, run root-cause analysis, deploy fixes and provide technical support. You will apply best practices for data quality, testing, and observability, helping to ensure the data delivered to stakeholders is accurate and trustworthy. Contribute to CI/CD practices, documentation and engineering standards. Partner cross-functionally to deliver fit-for-purpose data solutions. Proactively …
lead performance testing and chaos engineering initiatives, and embed reliability best practices across engineering, DevOps, and infrastructure teams. This is a senior, strategic leadership role focused on system excellence, observability, and continuous improvement. Ideal Candidate: Proven experience leading Performance Engineering, Reliability, or SRE functions Deep expertise in performance testing methodologies (load, stress, spike, soak) Strong hands-on background with LoadRunner … strategy across critical platforms and services Oversee load, stress, and chaos testing initiatives to ensure systems perform and recover under real-world conditions Define and drive best practices for observability, monitoring, and APM adoption using tools like Dynatrace Drive incident reduction, faster recovery (MTTR), and continuous reliability improvements Champion a culture of performance ownership, ensuring teams build with scalability, stability …
Employment Type: Full-Time
Salary: £84,000 - £95,000 per annum, Negotiable, Inc benefits
Peterborough, Cambridgeshire, England, United Kingdom Hybrid / WFH Options
Noir
Performance & Reliability Director - Software House - Peterborough/Hybrid (Key skills: Performance Engineering, Reliability Engineering, SRE, Load Testing, Observability, Chaos Testing, Cloud Platforms, Microservices, Leadership, CI/CD, APM Tools) Are you a technology leader passionate about driving performance, scalability, and reliability across complex software platforms? Do you thrive in high-growth environments where innovation, engineering excellence, and resilience are core … lifecycle. You'll oversee system profiling, capacity planning, and test strategies - ensuring every release meets the highest standards for speed, scalability, and reliability. You'll drive the adoption of observability and monitoring frameworks, leveraging platforms like Datadog and Dynatrace to build a proactive performance culture. You'll champion continuous improvement, implement chaos testing programmes, and ensure teams deliver fault-tolerant …
St. Albans, Hertfordshire, South East, United Kingdom
Method-Resourcing
Software Engineering Lead £90,000 + Equity
Method is working with a purpose-driven technology company on a multi-year transformation to rebuild its core platform into a modern, event-driven microservices architecture. Their mission is to improve safety, efficiency …