We're on a mission to democratize audio creation by building world-class audio infrastructure for our customers. As a Site Reliability Engineer, you'll play a key role in improving our platform's developer operations, including observability … monitoring, and overall reliability. You will be part of a cross-functional team dedicated to implementing robust DevOps practices and enhancing infrastructure and site reliability engineering (SRE). A customer-focused mindset is essential, as the team collaborates closely with stakeholders to ensure solutions meet business and user needs. In addition to a focus on observability, you …
are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability … and SOPs. Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure. Other duties as needed. About You: 7+ years' experience in Site Reliability Engineer roles. 3+ years' experience with an object-oriented language (preferably Java, .NET or C++). Expert+ level Linux administration, scripting, and troubleshooting. Demonstrable knowledge of observability tools …
South West London, London, England, United Kingdom
Oscar Technology
Site Reliability Engineer - AWS/Azure | Outside IR35 | £450-500/day … Month Contract | Paddington, London (Hybrid, 2 Days Onsite). We're working with a fast-growing client undergoing rapid expansion that is looking for an experienced Site Reliability Engineer (SRE) to join them on a 6-month contract (outside IR35). You'll be leading efforts across AWS and Azure cloud environments, focusing on automation, observability, infrastructure as code and performance at … Monitor, Grafana, ELK). Own incident response processes, ensuring high availability and rapid resolution. Collaborate with stakeholders to communicate solutions and technical trade-offs clearly. Ideal Experience: 3-5 years' SRE or DevOps experience across AWS and Azure platforms. Strong knowledge of Terraform, scripting (Python, Bash, PowerShell), and cloud architecture. Comfortable with containerisation and orchestration (Docker, Kubernetes). Understanding of networking, DNS …
Site Reliability Engineer with Python. Our client is looking to bring on a site reliability engineer to help deploy, manage, troubleshoot, and enhance our complex cloud-based set of internal tools and externally managed services for a variety of users across our wide-ranging organization. You will have at least 7 to 10 years' hands-on expertise … working as a Site Reliability Engineer. You will work closely with IT, product, and engineering to extend and maintain this set of tools and services and to help debug and resolve problems. In addition, the ideal candidate will proactively look for system weaknesses and find ways to resolve them before they can cause production issues via monitoring … planning, retros, issue tracking, etc.) - Actively lead any critical issue post-mortem processes, including coordination of any meetings and further steps to take. Qualifications: - 7+ years' experience with software engineering, software development, and/or system operations - Experience debugging complex problems and implementing timely, cost-effective solutions - Experience designing, building, and operating large-scale production systems - Deep knowledge of …
Site Reliability Engineer/DevOps Engineer. Location: Farringdon. Time type: Full time. Posted 9 days ago. Job requisition ID: R94904. Are you enthusiastic about designing and managing cloud platforms? Do you find satisfaction in ensuring the … reliability and performance of complex systems? About Team: The LexisNexis Intellectual Property (IP) division provides international patent content and a suite of online and analytic tools that meet the evolving needs of the intellectual property market. We deliver data to support LexisNexis IP search and analytics applications, empowering our customers with actionable insights and metrics for critical business decisions. … and communities. Working here means joining a vibrant, diverse, and collaborative team where you are free to grow and contribute actively. About Role: We are a high-performing systems engineering team operating in a fast-paced enterprise environment, focused on modernising our infrastructure while upholding strict security and compliance standards. Our engineers work with Microsoft Hyper-V and a …
Kingdom. Join Axon and be a Force for Good. Your Impact. What You'll Do. Location: London, England. Build robust, easy-to-use foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem … solving skills, with the ability to debug problems in cloud-native distributed systems. Influence and educate the engineering organization to adopt new and improved architectural patterns. Provide robust documentation for use by engineers to promote self-service. Take calculated risks, champion new ideas, and cultivate your craft. What You Bring: 7+ years of applicable experience. Experience managing cloud platforms …
Site Reliability Engineer. Remote - Canada, Americas/Engineering. We offer: The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across the … radical responsibility. If this sounds like an environment that you believe could work for you, then read on to find out more. The role: We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature, always looking for ways to improve, as we will look … we expect this role to be an advocate of continuous improvement. Reliability of our new global Tyk Cloud platform. Automation of operations and support. Writing and maintaining documentation on SRE processes and policies. Recommending and implementing ways of driving operational efficiency and driving down our cost to run, without impacting service. Assisting in penetration testing for Cloud through liaising with …
Site Reliability Engineer. Remote - APAC/Engineering. The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across the retail, finance, telecoms, healthcare, or … radical responsibility. If this sounds like an environment that you believe could work for you, then read on to find out more. The role: We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature, always looking for ways to improve, as we will look … we expect this role to be an advocate of continuous improvement. Reliability of our new global Tyk Cloud platform. Automation of operations and support. Writing and maintaining documentation on SRE processes and policies. Recommending and implementing ways of driving operational efficiency and driving down our cost to run, without impacting service. Assisting in penetration testing for Cloud through liaising with …
Job Title: Cloud Engineer/SRE - Golang & GitHub. Location: Remote - UK, London. Salary/Rate: Up to £604 a day Inside IR35. Start Date: July 2025. Job Type: 12-Month Contract. Company Introduction: We are seeking a highly skilled Cloud Engineer/SRE with development experience in Go and GitHub to join our client in the Global Analytical Risk sector. … GitHub Actions (designing complex workflows, custom actions); GitHub Enterprise, Organisation and Repository settings. Operations/Infrastructure Background: Proven experience in an operations, site reliability engineering (SRE), or infrastructure engineering role, with a strong appreciation for automation and stability. Modern SDLC Practices: Familiarity with: dependency management; security remediation processes and secure coding practices; …
Systems Engineering Lead (Lead Ops Engineer). Location: London. Time type: Full time. Posted 21 days ago. Job requisition ID: R95240. About our Team: Our global team supports the production infrastructure which underpins a range of the company's products and services. These services form the core … our customers. We have a stable, well-established product that we continuously maintain and improve. Our team values trust, respect, collaboration, agility, and quality. About the Role: The Systems Engineering Lead is responsible for guiding a dedicated team of Systems Engineers. The incumbent will contribute to designing optimal system configurations, planning hardware installations and upgrades within complex systems and … and orchestration tools (e.g., Docker, Kubernetes/EKS). Skilled in scripting languages such as Python, Bash, TypeScript, and PowerShell. Familiarity with DevOps & Site Reliability Engineering (SRE) principles, practices, and tools. Hands-on experience with monitoring and logging solutions (e.g., New Relic, Coralogix, AWS CloudWatch, Azure Monitor). Strong problem-solving, stakeholder management, written …
in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll do: Design, implement, and manage observability solutions using industry-leading tools such … mindset with a passion for continuous improvement and knowledge sharing. Certifications: Dynatrace Associate & Pro; Splunk Core Certified Power User. Desirable Experience: DevOps or Site Reliability Engineering (SRE) experience. Automation with Terraform or similar tools. Building CI/CD pipelines. Experience with Docker and Kubernetes for packaging and deployment. Ability to adapt to new technologies in fast-paced …
impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll do: Design, implement, and manage observability solutions using industry-leading tools … A proactive mindset with a passion for continuous improvement and knowledge sharing. Certifications: Dynatrace Associate & Pro; Splunk Core Certified Power User. DevOps or Site Reliability Engineering (SRE) experience. Automation with Terraform or similar tools. Experience with Docker and Kubernetes for packaging and deployment. Ability to adapt to new technologies in fast-paced environments. Please note that all …
additional/ad hoc duties as required to meet the needs of the business. Experience/Competences: Deep and broad experience of AWS Cloud platform and services. DevOps and SRE principles. Very good working knowledge of incorporating testing into CI/CD pipelines. Understanding of various deployment patterns such as blue-green and canary. Platforms: Windows Server, Amazon Linux, RHEL …
The role: We're on the lookout for a Site Reliability Engineer (SRE) with a thirst for innovation and a desire to establish Operational Excellence and best practices. You'll be instrumental in fortifying the backbone of our AI-driven autonomous vehicles, ensuring they're robust, resilient, and ready to revolutionise urban mobility. This role isn't just … Champion automation to continuously elevate our efficiency, aiming to make manual interventions a thing of the past. About you: In order to set you up for success as an SRE at Wayve, we’re looking for the following skills and experience. Essential: Over 8 years' experience in Site Reliability Engineering or a similar role, especially in a …
generating results that allow our clients to thrive. What You'll Do: The Senior Director - Operations and Reliability Engineering is responsible for blending Site Reliability Engineering (SRE), DevOps, and traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core … agility and operational resilience. Establish workforce development programs for AI-driven operations, automation, and modern reliability practices. What You'll Bring: Required Qualifications: 15+ years of experience in IT operations, SRE, DevOps, or platform engineering. 5+ years in a senior leadership role, managing large-scale IT environments. Deep technical expertise in cloud computing (AWS, Azure, GCP), on-prem infrastructure, and hybrid environments. Proven … remediation. Strong understanding of zero-trust security, regulatory compliance, and risk management. Excellent leadership, communication, and stakeholder management skills. Preferred Qualifications: Certifications: ITIL, AWS/Azure/GCP Solutions Architect, SRE Foundation, CISSP, or equivalent. Experience with Kubernetes, Terraform, Ansible, and AI-powered operations tools. Strong problem-solving abilities, with a data-driven approach to operational excellence. The Senior Director - Operations Platform Lead is …
Location: London, England, United Kingdom. Join Axon and be a Force for Good. As an SRE contributor in Axon's Real Time Operations organization, you are passionate about delivering solutions to the real-time problems our mission-critical cloud-native services encounter. You are also obsessed with achieving the high quality and reliability our customers demand. You will work … closely not only with your peers, but also the RTO engineering teams, allowing your technical deliverables to reach the entire engineering organization, enabling product teams to continuously deliver features on the vanguard of innovation and helping scale our products to thousands of agencies around the world. What You'll Do: Location: London, UK. Build robust, easy-to-use … foundational platforms and tools that enable engineering teams to provision services rapidly, consistently, and securely. Exemplify cloud-native site reliability best practices. Write code that is performant, maintainable, clear, and concise. Employ strong problem-solving skills, with the ability to debug problems in cloud-native distributed systems. Influence and educate the engineering organization to adopt new …
Site Reliability Engineer. London (Blackfriars) – 7 months. Certain Advantage are recruiting on behalf of our prestigious Financial Services client for an SRE Engineer in their AWS DB team, who support numerous native DBs like RDS/Aurora/Neptune plus CockroachDB. This is a contract position for 7 months working inside IR35. Lead Site Reliability Engineers (SRE) play an … support is required outside of working hours. Participate in enhancing product observability and telemetry, and support modernization. Brainstorm ideas to simplify and streamline infrastructure by working closely with infrastructure and SRE teams. Required qualifications, capabilities and skills: Knowledge of Python/Unix shell scripting & SQL. Good understanding of development tools: source code control software, automated build, automated testing and JIRA. Understanding … of the infrastructure as code (IaC) concept is desirable. Experience with build automation, test-driven development, continuous integration and delivery. Experience with relational and non-relational databases. Previous SRE experience, including knowledge of SLO/SLA/SLI and error budgets, is advantageous. Experience working with, or familiarity with, one public cloud (AWS, Google or Azure). Preferred skills – what’ll get …
Senior Software Engineer, Reliability Engineering. London. At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status … Champion automation to continuously elevate our efficiency, aiming to make manual interventions a thing of the past. About you: In order to set you up for success in the reliability team at Wayve, we’re looking for the following skills and experience. Over 8 years' experience in Site Reliability Engineering or a similar role, especially in …
analysts, and support staff. Overview: We are looking for a highly skilled and visionary leader to join our team as the Head of Site Reliability Engineering (SRE) with a strong focus on AWS cloud infrastructure. The ideal candidate will have a deep understanding of cloud architectures, extensive experience in SRE practices, and the ability to lead and … scale SRE teams to ensure the availability, performance, and security of our systems. Key Responsibilities: Leadership and Team Management: Lead and manage the SRE team to ensure high availability, scalability, and performance of our AWS-based infrastructure. Provide mentorship and guidance to junior and senior engineers, fostering a culture of operational excellence and continuous improvement. Cloud Infrastructure Management: Oversee the … design, implementation, and maintenance of cloud infrastructure in AWS, ensuring the systems are secure, reliable, and highly available. Use best practices for AWS services, automation, and monitoring. SRE Practices Implementation: Establish and lead the implementation of SRE principles, such as Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets, to drive the team's focus on reliability. Incident …
Are you an experienced Senior DevOps/Site Reliability Engineer looking for your next contract role? Join one of the world's leading IT services, consulting, and business solutions organizations. Founded in 1968, the company consistently ranks among the top global IT service providers. With a presence in over 50 countries, the company has built a reputation for … across industries including banking, healthcare, telecommunications, and retail. The leading consultancy firm has partnered with a global technology leader, and they are currently seeking an experienced Senior DevOps/Site Reliability Engineer to join the team. Additionally, this role provides a hybrid working arrangement based in London. Ready to make a move? Get in touch and apply today …
Engineering Head - Public Cloud Infrastructure Services - Director. Join to apply for the Engineering Head - Public Cloud Infrastructure Services - Director role at Citi. About The Opportunity: Are you a seasoned technology leader with a passion … for building cutting-edge enterprise products and a hands-on approach to engineering? Join Citi's Cloud Technology Services (CTS) team and be part of our commitment to transform Citi technology leveraging game-changing Cloud capabilities to drive agility, efficiency, and innovation. … development background is highly desirable, specifically architecting or developing microservices using Java/Spring Boot to automate infrastructure deployment. Agile and DevOps Mindset: Familiarity with Agile Development, DevOps, and SRE practices. Strategic Thinking: Experience evaluating complex requirements and rationalizing them into a consistent service offering. Leadership Experience: A proven track record of managing a diverse, inclusive, and high-performing Engineering …
The Senior Software Engineer works with our Global Content Delivery teams to deliver exabytes of content for our brands globally. The Senior Software Engineer has a highly skilled combination of engineering and operations skills and is focused on automating and improving operations. Your job is to guarantee system reliability, performance, and supportability with a strong engineering emphasis on … building autonomous solutions that deliver value to end-users early, often, and fast. You are central to the reputation and trustworthiness of our services and advocate for engineering best practices. WBD’s Global Content Delivery team members are responsible for the management and governance of all media delivery infrastructure for Max, D+, CNN, and other WBD brands globally. You will … external). Qualifications and Experience: At least 5 years of experience within Software Engineering and proficient in one of the following: Golang, Python, Java or Node.js. Passionate about SRE, DevOps, Automation, and infrastructure platforms. Must excel with agile and lean development practices and manage multiple priorities and multiple roles. BS Degree in Computer Science, Physics, Engineering or Mathematics …
this business-critical environment. This senior leadership role will report to the Head of FIC Production & Reliability Engineering. You will spearhead the Site Reliability Engineering (SRE) function across FIC Technology, defining and executing the strategic vision, while remaining embedded in managing and accountable for day-to-day production stability in the Rates & Credit business. You will … be responsible for the resilience, scalability, and performance of mission-critical trading and risk systems, particularly within the Rates & Credit business, while also influencing SRE practices across FX, Repo, Emerging Markets, Listed Derivatives and other Core Platforms. You will work closely with senior business and technology stakeholders to shape the future of Production Engineering, while remaining deeply engaged in … the option to purchase additional days. The opportunity to support a wide-ranging CSR programme + 2 days’ volunteering leave per year. Your Key Responsibilities: Define and drive the SRE strategy across FIC Technology, aligning reliability goals with business priorities and regulatory expectations. Lead the transformation of production support into a proactive, data-driven engineering discipline focused on …
Site Reliability Engineer (DV Security Clearance). Position Description: CGI was recognised in the Sunday Times Best Places to Work List 2025 and has been named one of the 'World's Best Employers' by Forbes magazine. We offer a competitive salary, excellent pension, private healthcare, plus a share scheme (3.5% + 3.5% matching) which makes you a member not … agencies' most challenging problems. Our teams work alongside our clients to help them understand how to exploit technologies to maintain competitive advantage. Our systems are engineered for performance, security, reliability and scalability; built with modern CI and CD tooling and techniques. We are currently looking for an experienced cloud infrastructure engineer to join our team - being able to think … all of the skills we need, we would consider high-quality individuals who meet most of the criteria. Required qualifications to be successful in this role: • Background in Software Engineering, including the development of automation scripts, infrastructure as code, creating tooling or frameworks and feature development, ideally using Java and/or Python. • Experience of engineering enablement products …
London, South East, England, United Kingdom Hybrid / WFH Options
BOSS Professional Services LTD
SRE Engineer. Full-time. UK - Remote/Hybrid. My client is a high-growth ecommerce business which runs its technology stack on AWS. Due to the nature of the business, the SRE Engineer will need to support sudden peaks in traffic by scaling smoothly. The business also hosts ecommerce platforms for other brands, which also need supporting. As an SRE Engineer … you will maintain a scalable and reliable production environment for running software services while helping grow the customer base and product offering. For the SRE Engineer role we are seeking: Technology stack: Kubernetes, MySQL, PostgreSQL, PHP, Python, Docker, AWS Lambda, AWS, Redis, ELK; monitoring: Prometheus, Grafana or Loki. You have previous experience of working within an SRE capacity or experience in … Assist and support the DevOps engineers in setting up the infrastructure for microservices. Work closely with the rest of the DevOps and QA team to load test applications. Responsibilities for the SRE Engineer include: Create sustainable systems and services through automation and uplifts. Partner with development teams to improve services. Gather and analyse metrics from both operating systems and applications. Participate in …