are seeking Senior Cloud & Application Security Engineers to help our client define and implement its cloud security strategy. If you're an experienced Security Engineering professional excited to work with cutting-edge technology and collaborate with diverse teams, we want to hear from you! Key Skills: Strong understanding of … manage multiple security projects effectively. Responsibilities Security Strategy: Define and execute cloud security strategy, partnering with platform and Site Reliability Engineering (SRE) teams to build robust infrastructure that supports our business. Perimeter Security: Establish platform perimeter security by implementing controls at ingress and egress points, including creating … security services, including certificate authorities, encryption services, insecure configuration scanners, and security control canaries. Key Requirements: Essential: 5+ years of experience in cloud security engineering, particularly with AWS and Azure, and at least 2 years in software development. Desired: Ability to work independently, take initiative, and maintain a keen …
London, South East England, United Kingdom Hybrid / WFH Options
Nationwide Building Society
Senior Application Engineer London or Swindon Office Hybrid role - x2 days on site/x3 Work from home Nationwide is leveraging the power of Cloud, DevOps and Agile to bring teams together and create compelling Digital experiences for members of today and tomorrow. At the same time, we’re … solutions Knowledge of Financial services and the design and delivery of Conversational Banking solutions Knowledge or interest in Site Reliability Engineering (SRE) principles Our customer first behaviours put customers and members at the heart of how we work together. They are the set of behaviours that every …
which pronouns you use (For example: she/her, he/him, they/them, etc). At Bumble, Site Reliability Engineers (SRE) are responsible for ensuring the reliability, scalability and performance of software systems while bridging the gap between development, security and operations. We proactively manage … infrastructure provisioning. Monitor system health and performance, identifying and fixing issues Respond to system outages, troubleshooting root causes and implementing preventative measures Collaborate with engineering teams and security engineers to improve system reliability, security and performance Participate in on-call rotations Create and maintain documentation to improve knowledge … must Proficiency in at least Python or Golang programming languages Experience with CI/CD pipelines Strong proficiency with Kubernetes architecture Prior experience in SRE, System administration or DevOps roles Strong proficiency with Linux/Unix operating systems, including hands-on experience in configuration and troubleshooting Proficiency with using Puppet …
enable and empower industries at a global scale. About the Team: The global Production Operations group is integral to ensuring the operational stability and reliability of our … worldwide 24x7 on-premises and cloud environments. As the first line of defense this team has ownership of operations engineering. Collaborating closely with IT, SRE, Network, and Data engineering teams, and key stakeholders across business, product, and software engineering teams. We play a crucial role in maintaining systems … issues, providing both internal and external teams with technical support and ensuring the issue remains in custody until resolution. Collaborate with product and software engineering teams to relay operational insights and requirements. Automation, Tooling & Research Continuously identify opportunities for optimization and present findings to technical leads and management. Research …
leading organizations, like Samsung and Toyota, trust MongoDB to build next-generation, AI-powered applications. We are looking for an experienced Lead for our SRE, InfraSec team, to guide the security of our cloud-based infrastructure. As a Lead SRE, you will be very hands-on technically while also directly … adheres to the highest security standards. They build essential security infrastructure and implement controls that reinforce the platform's security posture. This is an SRE team, which means you can expect a highly hands-on approach, tackling the technical challenges of implementing large scale solutions. This team is deeply involved … implement, and manage cloud-native security tools and platforms for endpoint security, identity management (IAM), and CSPM Qualifications: Experience: 7+ years of experience in SRE, infrastructure engineering or similar role, with a strong focus on security work, with ideally 2+ years in a leadership or senior engineering role …
A prestigious, technology-driven hedge fund is seeking a highly skilled Site Reliability Engineer (SRE) to join their global infrastructure team. This is a unique opportunity to work in a high-performance, low-latency trading environment where technology is at the heart of the firm’s competitive edge. … critical role in ensuring the performance, reliability, and scalability of the systems that power the fund’s trading and research platforms. As an SRE, you will work closely with software engineers and investment teams to build automation-first solutions that support the firm’s most advanced strategies. Key Responsibilities … across the business. Design and implement automation to eliminate manual tasks and reduce operational risk. Collaborate with software and investment teams to embed the SRE mindset early in the development lifecycle. Ideal Candidate: SRE with experience working with data systems Ability to program (structured, OOP, and TDD) using one or …
the role: We are looking for a highly capable and experienced Site Reliability Engineer to join our growing tech team. As an SRE you will be a hands-on coach for the development teams maintaining and improving our solutions' reliability. You will be part of our DevOps team … but spend most of your time working closely with the engineering teams. Our ideal candidate will be passionate about best practices within technology teams, fully supportive of what the group is doing, and who wishes to make a difference. Responsibilities: Work with the development teams to build robust and …
Accreditation Council for Graduate Medical Education
Your Impact As a contributor in the APX SRE organization, you are passionate about delivering solutions to the real-time problems our mission-critical cloud native services encounter. You are also obsessed about achieving the high quality and reliability our customers demand. You will work closely not only with … the APX SRE organization, but your technical deliverables will reach the entire engineering organization to enable product teams to continuously deliver features on the vanguard of innovation. What You'll Do Location: London, England. Build robust, easy-to-use foundational platforms and tools that enable engineering teams to … provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems. Influence and educate the engineering organization to adopt new …
are seeking an experienced Platform Engineering leader with a hands-on engineering background, who can articulate the business benefits that Observability and SRE provide to our clients and take on the responsibility of handling client engagements from both technical and business perspectives. Requirements: We are ideally looking for … someone with a strong background and experience in the following: Observability and SRE Practices: In-depth understanding of observability and Site Reliability Engineering practices. Familiarity with tools in the LGTM stack (Loki, Grafana, Tempo, Mimir) or equivalent observability platforms. Containerisation: Strong experience building and managing containerised applications … We help brands across the globe design and build innovative products, platforms, and digital experiences for the modern world. By integrating experience design, complex engineering, and data expertise, we help our clients imagine what's possible, and accelerate their transition into tomorrow's digital businesses. Headquartered in Silicon Valley …
Shazam Site Reliability Engineers are not just responsible for making sure all services and systems that Shazam relies on are operating at their highest level; they're also responsible for helping development teams embrace these principles … as they develop software. Shazam SREs embed themselves with development teams and act as extensions of those teams to propagate best practices. As an SRE, you'll collaborate with development teams to help them understand the bigger picture of distributed systems, beyond individual components. We are strong believers in ownership … with software engineers being responsible for the code they write. The SRE team helps build the competencies across teams to ensure we build scalable and supportable systems. This role sits in our London office reporting to our Head of SRE. The successful candidate will be assisting multiple development teams based …
Site Reliability Engineer - Field Operations London, UK C3 AI (NYSE: AI) is the Enterprise AI application software company. C3 AI delivers a family of fully integrated products including the C3 Agentic AI Platform, an end-to-end platform for developing, deploying, and operating enterprise AI applications, C3 AI … to streamline system updates and upgrades. Set up critical infrastructure, tools, and framework to streamline the deployment cycle. Work cross-functionally with Services and Engineering teams. Qualifications: Bachelor's degree in a Science, Technology, Engineering or Mathematics (STEM), or comparable area of study. Demonstrated experience in deploying, managing …
Accreditation Council for Graduate Medical Education
drive real change. Constantly grow as you work hard for a mission that matters at a company where you matter. Your Impact As an SRE contributor in the Infrastructure CICD group, you are passionate about delivering solutions to the real-time problems our mission-critical cloud native services encounter. You … are also obsessed about achieving the high quality and reliability our customers demand. You will work closely not only with other SREs and SWEs, but your technical deliverables will reach the entire engineering organization to enable product teams to continuously deliver features on the vanguard of … innovation. What You'll Do Location: London, England Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem …
empowering development teams by creating toolchains, guidelines, and standards. Our focus is on enabling seamless automation and CI/CD, comprehensive observability, and unwavering reliability in a secured cloud-native environment. The Opportunity The Staff Engineer position within the Platform As a Service team offers a compelling opportunity for … utilisation, enhancing fault tolerance, and ensuring the platform's ability to meet evolving demands efficiently and effectively. You provide guidance and mentorship to other SRE team members, helping them to develop their skills and knowledge of best practices in site reliability engineering. You establish and enforce engineering … organization. You collaborate with senior leadership to shape the vision and direction of the company (cloud) infrastructures, and you help drive the development of SRE-specific strategies and initiatives that align with business objectives. You build and maintain strong relationships with stakeholders across the organization, and you represent the SRE …
eDV Site Reliability Engineer Looking for an eDV SRE. Someone with a defence industry specialism with a passion … for creating efficient and secure cloud infrastructure. You will play a critical part in transforming and enhancing both internal and external operations through effective SRE practices. Core Responsibilities Infrastructure Excellence: Design, manage, and evolve our cloud-based infrastructure to support high-traffic applications and seamless service delivery. Secure Deployment: Develop …
Site Reliability Engineer, Simple Storage and Glacier team (S3G) Sector: Engineering Role: Professional Contract Type: Permanent Hours: Full Time DESCRIPTION Managing trillions of objects in storage, retrieving them in sub-x ms, building software that … scale of the exciting problems you will find every day working in Simple Storage Service (S3) and Glacier. The Region Services S3 and Glacier Engineering team are looking for a talented engineer who is motivated to solve complex challenges, yet are not constrained by "how things are usually done … Services around the globe, we need exceptionally motivated people who are driven by learning and innovation. Key job responsibilities Be actively involved in daily engineering activities, providing hands-on technical guidance and support. Define architecture, design, and proof-of-concept efforts for end-to-end project delivery, ensuring high …
relating to technology risks. THE ROLE & RESPONSIBILITIES This role will be responsible to continuously identify, monitor, measure, assess, and challenge operational risk for the Engineering Division. As a senior Technology Operational Risk Lead, you will be responsible for providing independent oversight and challenge of the first line of defense … 1LoD) technology risk management practices. The Engineering Organization includes the Engineering Division and technology and strategist groups in Revenue and Federation divisions. Our engineers are responsible for building and deploying innovative technical and quantitative solutions for our clients and our firm. Assess the governance of risk management practices … application, infrastructure, and platforms. Participate in key governance, steering groups and control forums. This role requires an energetic self-starter that can liaise with Engineering teams and business both regionally and globally. Experience and knowledge in a financial institution's technology infrastructure/applications and control requirements are required …
Senior Site Reliability Engineer - (Networks, AWS & Kubernetes) (BH-48405-2) Location: London, England Sector: IT Salary: £90,000.00 to £120,000.00 per annum Benefits: + 15% bonus + car allowance A truly unique opportunity to help launch a brand new team within a global financial services provider. This … skilled Full Stack Infrastructure Engineers will cover Compute, Storage, Network, and Cloud technologies. You will help design, implement, and manage robust infrastructure solutions, ensuring reliability, scalability, and performance. Requirements: Proven experience managing and optimizing a diverse infrastructure stack. Extensive knowledge of cloud platforms (AWS, Azure, GCP) and infrastructure as … pipeline management and DevOps practices. Strong understanding of disaster recovery and business continuity planning. Experience with performance tuning and capacity planning. Understanding of chaos engineering principles and practices. Skills in cost optimization for cloud infrastructure. Specific Tools and Techniques: Experience in using cloud native monitoring tools like AWS CloudWatch …
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS FinTech … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go …
City of London, London, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS FinTech … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go …
East London, London, United Kingdom Hybrid / WFH Options
Future Talent Group
Site Reliability Engineer – FinTech/Global Payments – London HQ/Remote First Salary - £80,000/£85,000 + Bonus Location - This UK-based team offers a fully remote working option, with a headquarters in Central London. In this role, you will be joining a leading SaaS FinTech … market. The business aims to scale its platform significantly over the next few years to support a growing international client base. Responsibilities Champion core SRE practices: define SLIs/SLOs/SLAs, reduce toil through automation, and plan for Disaster Recovery. Refine KPIs to support data-driven decisions around reliability … teams to build resilient, observable, and maintainable features. Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go …