markets interests you, this could be the perfect opportunity to take your career to the next level! About the role: You will play a crucial role in ensuring the reliability, performance, and efficiency of the company's trading platforms. This is not your average DevOps role - this position focuses on site reliability, where you'll be troubleshooting, supporting traders … support new trading systems, continuously improving the infrastructure. • Drive automation and operational excellence by leveraging your Linux expertise, Kubernetes, and Python scripting skills. • Monitor and ensure high availability and reliability of trading applications while being on top of system alerts and incidents. Key Requirements: • 1-5 years' working experience • Background working in the financial services sector, ideally supporting traders … Solid experience with Linux systems administration and troubleshooting. • Hands-on experience with Kubernetes for container orchestration. • Proficient in Python scripting for automation and system management. • A mindset focused on site reliability and performance. • Strong troubleshooting skills and a proactive approach to problem-solving. Salary: Up to £90,000 base salary Lucrative bonus scheme Company perks/benefits Location
London, South East, England, United Kingdom Hybrid / WFH Options
Michael Page Technology
issues. Ensure adherence to SLAs and help improve operational support efficiency. Participate in on-call rotations to provide 24/7 platform coverage. Continuously optimize monitoring, alerting, and platform reliability processes. Demonstrate a "can do" attitude, with flexibility to work occasional overtime when incidents extend beyond normal working hours. Profile Required … Skills & Qualifications Bachelor's degree in Computer Science, Information Technology, or related field (or equivalent work experience). Proven experience in technical support, site reliability engineering (SRE), or platform operations. Strong knowledge of Linux/Unix and Windows environments. Familiarity with cloud platforms (Azure, GCP). Hands-on experience with CI/CD tools (Jenkins, GitHub Actions
to join a leading technology and innovation consultancy, supporting UK public sector clients in their cloud transformation journeys. This role sits within a highly skilled team dedicated to designing, engineering, and optimising Google Cloud Platform (GCP) solutions that power large-scale, mission-critical systems. The successful candidate will play a key role in shaping cloud strategy, driving architectural excellence … technical architecture and delivery of Google Cloud solutions for public sector organisations. Design, deploy, and operate secure, scalable, and high-performing GCP environments. Provide technical leadership and mentorship to engineering teams to ensure successful project delivery. Apply deep knowledge of Google Cloud architecture and engineering to deliver enterprise-grade solutions that meet both functional and non-functional requirements. … networking (TCP/IP, subnets, load balancing, DNS). A track record of leading small technical teams, providing guidance and mentorship. Experience in site reliability engineering (SRE) or IT operations, including incident response and troubleshooting. Strong problem-solving and innovation skills, with evidence of delivering technical improvements or new ways of working.
warrington, cheshire, north west england, united kingdom
Anson McCade
to join a leading technology and innovation consultancy, supporting UK public sector clients in their cloud transformation journeys. This role sits within a highly skilled team dedicated to designing, engineering, and optimising Google Cloud Platform (GCP) solutions that power large-scale, mission-critical systems. The successful candidate will play a key role in shaping cloud strategy, driving architectural excellence … technical architecture and delivery of Google Cloud solutions for public sector organisations. Design, deploy, and operate secure, scalable, and high-performing GCP environments. Provide technical leadership and mentorship to engineering teams to ensure successful project delivery. Apply deep knowledge of Google Cloud architecture and engineering to deliver enterprise-grade solutions that meet both functional and non-functional requirements. … networking (TCP/IP, subnets, load balancing, DNS). A track record of leading small technical teams, providing guidance and mentorship. Experience in site reliability engineering (SRE) or IT operations, including incident response and troubleshooting. Strong problem-solving and innovation skills, with evidence of delivering technical improvements or new ways of working.
bolton, greater manchester, north west england, united kingdom
Anson McCade
to join a leading technology and innovation consultancy, supporting UK public sector clients in their cloud transformation journeys. This role sits within a highly skilled team dedicated to designing, engineering, and optimising Google Cloud Platform (GCP) solutions that power large-scale, mission-critical systems. The successful candidate will play a key role in shaping cloud strategy, driving architectural excellence … technical architecture and delivery of Google Cloud solutions for public sector organisations. Design, deploy, and operate secure, scalable, and high-performing GCP environments. Provide technical leadership and mentorship to engineering teams to ensure successful project delivery. Apply deep knowledge of Google Cloud architecture and engineering to deliver enterprise-grade solutions that meet both functional and non-functional requirements. … networking (TCP/IP, subnets, load balancing, DNS). A track record of leading small technical teams, providing guidance and mentorship. Experience in site reliability engineering (SRE) or IT operations, including incident response and troubleshooting. Strong problem-solving and innovation skills, with evidence of delivering technical improvements or new ways of working.
such as Prometheus and Grafana Maintain strong understanding of networking, system administration, and security best practices Requirements Proven experience in a DevOps or Site Reliability Engineering (SRE) role Solid experience with at least one major cloud provider (AWS, Azure, GCP) Hands-on experience with Kubernetes and Docker Proficiency in Terraform, CloudFormation, or similar IaC tools Strong scripting
We are hiring for a Site Reliability Engineer (SRE) - Monitoring Focus (Senior Monitoring and Telemetry Specialist) Location: Knutsford - Hybrid - 2 to 3 days in Office Should have expertise in designing, implementing, and maintaining the telemetry and monitoring solutions that drive the health, performance, and reliability across diverse infrastructure Proven, hands-on experience in configuring, managing, and leveraging industry
Cheshire East, England, United Kingdom Hybrid / WFH Options
GIOS Technology
We are hiring for a Site Reliability Engineer (SRE) - Monitoring Focus (Senior Monitoring and Telemetry Specialist) Location: Knutsford - Hybrid - 2 to 3 days in Office Should have expertise in designing, implementing, and maintaining the telemetry and monitoring solutions that drive the health, performance, and reliability across diverse infrastructure Proven, hands-on experience in configuring, managing, and leveraging industry
looking for a DevOps Engineer to join our growing team. Day to Day You'll Be: Infrastructure & Operations: Participate in the design, implementation, and maintenance of our infrastructure, ensuring reliability, scalability, and security. Support, monitor, and enhance the live infrastructure and platform solutions, ensuring high availability and performance. Help plan and execute the integration of our current infrastructure into … analysis. Documentation & Best Practices: Ensure comprehensive documentation of infrastructure, systems, and processes to support onboarding, troubleshooting, and scalability. Promote and implement DevOps and Site Reliability Engineering (SRE) best practices across the organisation. Essential Skills & Experience: Technical Expertise: Strong Linux systems administration experience, including firewalls and hardening. Expertise in Docker and container orchestration. Proficiency with Infrastructure as Code … Familiarity with GCP services such as Compute Engine, Kubernetes Engine (GKE), Cloud Storage, BigQuery, and IAM. Familiarity with configuration management and IT automation tools. Strong understanding of DevOps and SRE principles. Soft Skills: Self-motivated, highly organised, and capable of driving initiatives from concept to delivery. Excellent communication and stakeholder management skills. Desirable Skills: Experience with serverless infrastructure (e.g., AWS
Overview We are seeking a skilled MongoDB Engineer to join a dynamic infrastructure engineering team based in Knutsford. This role is focused on building and maintaining scalable, secure, and reliable infrastructure platforms that support critical applications and data systems. You will apply software engineering principles, automation, and incident response best practices to ensure operational excellence across technology platforms. … Key Responsibilities Design, develop, and maintain infrastructure solutions with a focus on performance, reliability, and scalability. Monitor system performance and proactively address incidents, vulnerabilities, and outages. Implement automation using scripting languages and configuration tools to streamline operations. Ensure secure configurations and protect infrastructure against cyber threats and unauthorized access. Collaborate with product managers, architects, and engineers to align infrastructure … automation. Familiarity with DevOps tools such as Git, JIRA, and CI/CD pipelines. Strong scripting skills in Python or Bash. Understanding of Site Reliability Engineering (SRE) practices and incident management. Knowledge of containerization and orchestration tools such as Kubernetes. Desirable Attributes Experience in financial services or regulated industries. Ability to work collaboratively across cross-functional teams.
City of London, London, United Kingdom Hybrid / WFH Options
Anson Mccade
Professional-level Google Cloud certification required. Proven expertise in Google Cloud Platform services (Compute Engine, App Engine, GKE, Cloud Storage, IAM, VPC, etc.). Strong experience in architecting and engineering cloud-based solutions that meet both functional and non-functional requirements. Solid understanding of cloud networking and security (e.g. firewalls, encryption, identity management). Experience implementing foundational cloud platforms … communication and stakeholder management skills across technical and non-technical audiences. Desirable Experience with multi-cloud environments (AWS, Azure, hybrid). Familiarity with site reliability engineering (SRE) principles and production systems support. Experience driving innovation, technical transformation, and improvements in ways of working. What's on Offer Competitive base salary between £75,000 and £90,000.
london, south east england, united kingdom Hybrid / WFH Options
Anson Mccade
Professional-level Google Cloud certification required. Proven expertise in Google Cloud Platform services (Compute Engine, App Engine, GKE, Cloud Storage, IAM, VPC, etc.). Strong experience in architecting and engineering cloud-based solutions that meet both functional and non-functional requirements. Solid understanding of cloud networking and security (e.g. firewalls, encryption, identity management). Experience implementing foundational cloud platforms … communication and stakeholder management skills across technical and non-technical audiences. Desirable Experience with multi-cloud environments (AWS, Azure, hybrid). Familiarity with site reliability engineering (SRE) principles and production systems support. Experience driving innovation, technical transformation, and improvements in ways of working. What's on Offer Competitive base salary between £75,000 and £90,000.
london (city of london), south east england, united kingdom Hybrid / WFH Options
Anson Mccade
Professional-level Google Cloud certification required. Proven expertise in Google Cloud Platform services (Compute Engine, App Engine, GKE, Cloud Storage, IAM, VPC, etc.). Strong experience in architecting and engineering cloud-based solutions that meet both functional and non-functional requirements. Solid understanding of cloud networking and security (e.g. firewalls, encryption, identity management). Experience implementing foundational cloud platforms … communication and stakeholder management skills across technical and non-technical audiences. Desirable Experience with multi-cloud environments (AWS, Azure, hybrid). Familiarity with site reliability engineering (SRE) principles and production systems support. Experience driving innovation, technical transformation, and improvements in ways of working. What's on Offer Competitive base salary between £75,000 and £90,000.
London, South East, England, United Kingdom Hybrid / WFH Options
Holland & Barrett International Limited
want to hear from you! Key Responsibilities: Security Strategy: Help define and execute the Holland & Barrett cloud security strategy, partnering with platform and Site Reliability Engineering (SRE) teams to build robust infrastructure that supports our business. Perimeter Security: Establish platform perimeter security by implementing controls at ingress and egress points, including creating and maintaining an edge network
leveraging cutting-edge tools and infrastructure to help organisations innovate and scale. You’ll play a key part in shaping cloud adoption strategies, building secure environments, and mentoring talented engineering teams to deliver exceptional outcomes. Please note: applicants must be eligible for BPSS and SC security clearance, requiring five years of continuous UK residency at the point of application. … Cloud Architect, Professional Cloud Engineer) Strong experience with Google Cloud services such as Compute Engine, App Engine, Cloud Storage, GKE, and associated IaaS technologies Proven background in architecting and engineering cloud-based solutions to meet technical and business objectives Hands-on experience with Terraform, CI/CD pipelines, and scripting languages (e.g., Bash, Python, Perl) Solid understanding of cloud … able to convey complex ideas to technical and non-technical audiences Demonstrable experience leading small teams and providing technical mentorship Desirable: Experience in site reliability engineering (SRE) or IT production system operations Background working within secure environments or public sector organisations Track record of introducing technical innovation or process improvements within delivery teams
cambridge, east anglia, united kingdom Hybrid / WFH Options
Speechmatics
The Role Speechmatics are seeking a Site Reliability Engineer (SRE) whose focus will be improving the reliability of our products, systems and infrastructure. You will work across teams to improve availability, scalability, performance and efficiency of our real-time AI inference APIs. You will get to work with high-scale GPU deployments spread across the world. Our … latency responses, making this a really interesting problem space to learn about. What you'll be doing: Working with a diverse group of engineers across Speechmatics to improve reliability of our products and systems, from design through to operation in production. Taking part in incident response, postmortems and ensuring the same incident doesn't happen twice. Managing and
Engineer, you'll take a hands-on and strategic role within the engineering organisation, driving reliability, scalability, and performance across mission-critical systems. You'll guide and mentor platform, SRE, and DevOps teams, ensuring operational excellence and robust automation practices throughout the organisation. A central focus of this position will be Infrastructure-as-Code (IaC), building and scaling frameworks that … . Drive the reliability, performance, and scalability of production systems. Collaborate with Product, Engineering, and Platform teams to deliver resilient cloud infrastructure. Mentor and develop engineers across SRE, Platform, and DevOps disciplines. Support and refine CI/CD pipelines and automation frameworks. Play a key role in scaling engineering operations and evolving internal toolsets. Key Skills & Experience … Strong communication skills, with the ability to translate complex technical concepts into actionable insights for both technical and non-technical stakeholders. Why Apply? This is more than just another SRE or DevOps role; it's a chance to define frameworks, tooling, and culture at scale. With the backing of significant investment, you'll have the freedom to introduce new approaches, influence
ll take a hands-on and strategic role within the engineering organisation, driving reliability, scalability, and performance across mission-critical systems. You’ll guide and mentor platform, SRE, and DevOps teams, ensuring operational excellence and robust automation practices throughout the organisation. A central focus of this position will be Infrastructure-as-Code (IaC), building and scaling frameworks that … . Drive the reliability, performance, and scalability of production systems. Collaborate with Product, Engineering, and Platform teams to deliver resilient cloud infrastructure. Mentor and develop engineers across SRE, Platform, and DevOps disciplines. Support and refine CI/CD pipelines and automation frameworks. Play a key role in scaling engineering operations and evolving internal toolsets. Key Skills & Experience … Strong communication skills, with the ability to translate complex technical concepts into actionable insights for both technical and non-technical stakeholders. Why Apply? This is more than just another SRE or DevOps role; it’s a chance to define frameworks, tooling, and culture at scale. With the backing of significant investment, you’ll have the freedom to introduce new
Senior Site Reliability Coach 6 months Remote £Negotiable - INSIDE IR35 Tech Stack Multiple Platforms and Applications AWS and Azure - Cloud Mainframe skills would … be handy Latest applications on Cloud DevOps skills would be helpful Attitude of being part of the team and owning the outcomes Advocate - to change the culture to SRE Disclaimer: This vacancy is being advertised by either Advanced Resource Managers Limited, Advanced Resource Managers IT Limited or Advanced Resource Managers Engineering Limited ("ARM"). ARM is a specialist
firewall footprint. Development of reusable infrastructure automation patterns and augment them for a holistic automation approach to firewall policy deployment. Serve as an SME to technical network and security engineering teams on topics related to building automation solutions and how to best integrate them into existing product APIs. Responsible for assisting in the definition and build of self-service … capabilities for policy deployment. Develop automation logic to achieve desired business outcomes from a technical, compliance and security perspective. Skills required: Understanding of Site Reliability Engineering or Platform Engineering concepts. Strong programming ability in Golang. Solid understanding of DevOps principles and patterns alongside strong CI/CD pipeline delivery Cloud Automation and Orchestration with IaC principles … desired API endpoints. Automation skills across Jenkins, GitHub, GitLab CI/CD Pipelines, Ansible, Terraform, etc. Experience in Network, Load Balancer and especially Firewall automation projects in architecture and engineering roles. Experience with Checkpoint technologies would be advantageous. Awareness of low code platforms to augment infrastructure workflow development. Awareness of building automation capabilities into an organization in the form
memory caching technologies, such as Redis and/or GridGain. Experience with programming languages, including Java or Python. Understanding of API development and best practices. Advocating a culture of Site Reliability Engineering practices to continuously measure, optimise, and improve systems. You may be assessed on the key critical skills relevant for success in role, such as risk … applications and data systems, using hardware, software, networks, and cloud computing platforms as required with the aim of ensuring that the infrastructure is reliable, scalable, and secure. Ensure the reliability, availability, and scalability of the systems, platforms, and technology through the application of software engineering techniques, automation, and best practices in incident response. Accountabilities Build Engineering: Development … delivery, and maintenance of high-quality infrastructure solutions to fulfil business requirements ensuring measurable reliability, performance, availability, and ease of use. Including the identification of the appropriate technologies and solutions to meet business, optimisation, and resourcing requirements. Incident Management: Monitoring of IT infrastructure and system performance to measure, identify, address, and resolve any potential issues, vulnerabilities, or outages. Use