excellence. Develop and implement strategic plans to enhance the reliability, scalability, and efficiency of our infrastructure. Collaborate with cross-functional teams to align SRE initiatives with broader organizational goals. Establish and maintain SLIs, SLOs, and SLAs for critical systems and services. Drive the adoption of best practices in automation … and management solution that helps organizations harness AI's potential while ensuring governance, security, compliance, and control. Experience Requirements: Proven experience in a senior SRE role or similar. Strong knowledge of cloud technologies and SLA, SLO, and SLI management. Experience leading teams and implementing SCRUM processes. Excellent communication and leadership skills. … Experience line managing, mentoring, and coaching. Responsibilities: Collaborate with the Principal SRE to shape and implement the SRE strategic plan. Lead the SRE team in translating strategy into actionable plans, coordinating these through the SCRUM process. Address wellbeing and performance concerns, fostering a positive and productive team environment. Work with …
Services Provider who are looking for a Site Reliability Engineer to join their team, where you will manage and guide the SRE team. As a Site Reliability Engineer, you will be responsible for: Manage and guide the Site Reliability Engineering (SRE) team … while promoting SRE principles throughout FCA product groups. Serve as the SRE subject matter expert and strategic lead within the delivery organization. Oversee and advise on daily operations related to observability tools, including their upkeep and optimization. Provide hands-on support to engineering teams for delivering observability initiatives, as needed. … the reliability of quality assurance results. Proven skills and experience to help you succeed in this role: Strong experience in a primary role as an SRE Engineer. Strong experience in DevOps tools (GitHub, GitHub Actions, Workflow, CodeQL, Jenkins, Nexus, CloudFormation/Terraform, etc.). Strong experience in monitoring tool …
Site Reliability Engineer (SRE) - Consultant - Digital Factory. At Capgemini Invent, we believe difference drives change. As inventive transformation consultants, we blend our strategic, creative and scientific capabilities, collaborating closely … and data. Superpowered by creativity and design. All underpinned by technology created with purpose. YOUR ROLE: As a Site Reliability Engineer (SRE), you will play a key role in ensuring the reliability, scalability, and efficiency of our clients' platforms. Your focus will include building strong observability … practices, aligning with the SRE mindset and principles, and driving continuous improvement. This will involve: Defining and implementing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure and maintain system and application performance, ensuring services meet agreed reliability targets. Instrumenting applications to collect key metrics, logs, and traces …
City Of London, England, United Kingdom Hybrid / WFH Options
Fruition Group
Job Title: Senior Site Reliability Engineer (SRE) Location: Central London (Hybrid - c. 1-2 days per week) Salary: £80,000 - £100,000 + benefits Why Apply? This is a fantastic opportunity for a seasoned Senior Site Reliability Engineer to take a lead role in … most innovative businesses in their market. Working with cutting-edge technology, this role offers high-impact challenges, meaningful collaboration, and excellent career progression. Senior SRE Responsibilities: Manage and optimise cloud infrastructure to ensure scalability, high availability, and security. Design and implement robust CI/CD pipelines for efficient product delivery. … like GitLab CI, Terraform/OpenTofu, Ansible, and scripting languages such as PowerShell or Python. Champion infrastructure best practices and mentor junior team members. Senior SRE Requirements: Extensive experience in SRE or DevOps roles within high-availability, cloud-native environments. Strong expertise with AWS (including EKS, MSK, RDS, VPC design, encryption …
Are you a Site Reliability Engineer with experience in the iGaming and Gambling sector looking for an exciting new challenge? BENEFITS: Up to £95k depending on experience, fully remote, excellent benefits package. Join a rapidly growing company at the forefront of the iGaming industry, dedicated to delivering … leading brand, blending sports betting and online casino entertainment on a cutting-edge, custom-built platform. As a Site Reliability Engineer (SRE), you will play a crucial role in designing, implementing, and maintaining scalable and reliable infrastructure. Working closely with development teams, you'll apply SRE principles … and resolving performance and availability issues. Manage and optimise containerised environments with Kubernetes, ensuring scalability and high availability. Collaborate with development teams to implement SRE best practices. Implement strategies for Continuous Deployment to minimise release risks. Required Experience & Expertise: Previous experience within the iGaming and Gambling sector. Strong experience with …
The Site Reliability Engineer (SRE) will play a key role in maintaining and scaling infrastructure, ensuring reliability, performance, and scalability. You will collaborate closely with development, operations, and security teams to improve the reliability and efficiency of applications, addressing incidents, automating processes, and managing infrastructure … GCP, or Azure), automate provisioning using Terraform or CloudFormation, and manage resources for optimal performance. Monitor, troubleshoot, and resolve incidents, optimizing systems to ensure reliability and minimize downtime. Implement monitoring (Prometheus, Grafana, Datadog) and set up alerting systems to proactively address issues and ensure scalability. Work with DevOps, engineering … AWS Certified DevOps Engineer, Google Professional Cloud Architect, or similar. Containerization & Orchestration: Experience with Docker, Kubernetes, or ECS/EKS for containerized applications. SRE Experience: Familiarity with SRE principles like SLAs, SLOs, and error budgets, and practical application of those in large-scale systems. Distributed Systems: Understanding of microservices …
up, we want like-minded humans to join us on this exciting journey. Are you ready? As a Site Reliability Engineer (SRE), you will play an important role in designing, building, and maintaining the infrastructure and tools necessary to support our software applications and services. You will … collaborate closely with the product engineering squads, technical operations, and security teams to ensure the reliability, scalability, and security of our platform. Your responsibilities will include automating infrastructure provisioning, configuration management, and deployment pipelines, utilizing best practices and modern technologies to streamline processes and improve efficiency. You will also … be responsible for monitoring system performance, identifying bottlenecks, and implementing solutions to enhance system reliability and performance. Key Responsibilities: Cloud Platform Management: Using Azure/AWS to manage and optimize infrastructure components, ensuring scalability, reliability, and cost management. Infrastructure Design and Implementation: Designing, building and maintaining the cloud …
City of London, London, United Kingdom Hybrid / WFH Options
Sanderson Recruitment
Role: Site Reliability Engineer Location: London (Hybrid) Salary: £80,000 - £105,000 As our Site Reliability Engineer, you'll work closely with our feature team and other colleagues to meet defined service level objectives and continually improve systems and environments. You'll define error … Candidate: Very strong engineering skills in Java, JavaScript, or Python. OpenTelemetry experience. Must have Core Java/Python. Must have experience as an SRE. Knowledge of Python data structures. Strong knowledge of deploy and release services, automation, and troubleshooting. Experience of utilising tools and technology across the software development …
Global Site Reliability Engineer Location: London About Us: Founded in 2013, GSR is a leading market maker and programmatic trading firm in the fast-evolving world of cryptocurrency trading. With over 200 employees across seven countries, we provide billions of dollars in liquidity daily to cryptocurrency protocols … be deeply embedded in every major sector of the cryptocurrency ecosystem. About the Role: We are seeking a Site Reliability Engineer (SRE) to design, optimize, and support highly available systems across our global trading infrastructure. As part of GSR's SRE team, you will manage a multi … infrastructure, including: Networking & Exchange Connectivity; Linux Systems & Kubernetes Administration; Microservice Orchestration & Observability; Disaster Recovery & Security Optimization. Your mission is to improve latency, scalability, and reliability, ensuring GSR remains a best-in-class market maker. We value engineers who drive automation, reduce friction, and enhance developer velocity through better tooling …
Site Reliability Engineer with Python. Our client is seeking a site reliability engineer to deploy, manage, troubleshoot, and enhance complex cloud-based internal tools and externally managed services for a diverse organization. You should have at least 7 to 10 years of hands-on … experience as a Site Reliability Engineer. You will collaborate with IT, product, and engineering teams to maintain and improve these tools and services, troubleshooting issues as they arise. The ideal candidate will proactively identify system weaknesses and resolve them before they cause production issues, using monitoring and data analysis …
We are seeking a highly skilled and proactive Oracle Site Reliability Engineer (SRE) to ensure the reliability, performance, and scalability of our critical Oracle-based applications and services supporting a global user base. The ideal candidate will … possess deep expertise in Oracle technologies and SRE methodologies. You will be responsible for ensuring the stability and efficiency of our Oracle systems, implementing automation, managing patching, and providing expert-level support to our global users. Strong cross-functional collaboration and a proactive approach to problem-solving are essential for … troubleshoot IT systems. Leverage cloud platforms and automation tools to enhance scalability and efficiency. Ensure compliance with IT standards and regulations. Apply knowledge of SRE (Site Reliability Engineering) and/or DevOps practices to improve system reliability and performance. Maintain UK Security Clearance BPSS (Baseline Personnel Security …
Job Description: Site Reliability Engineer with Python. Our client is looking to bring on a site reliability engineer to help deploy, manage, troubleshoot, and enhance our complex cloud-based set of internal tools and externally managed services for a variety of users across our wide … ranging organization. You will have at least 7 to 10 years of hands-on expertise working as a Site Reliability Engineer. You will work closely with IT, product, and engineering to extend and maintain this set of tools and services and to help debug and resolve problems. In addition …
Site Reliability Engineer. Remote type: Remote. Locations: London, UK; Manchester, UK; Remote (United Kingdom); Belfast, UK. Time type: Full time. Posted 27 days ago. Job requisition ID: R The Intapp Cloud Platform is a rapidly … You will work with Development and Product Management to design and deliver new functionality. You will perform deep dives into both systemic and latent reliability issues, and partner with software engineers across the organization to produce and roll out fixes. You will drive standardization efforts across multiple disciplines and services … a solid understanding of continuous integration, deployment and operations concepts. You have production experience of managing Windows infrastructure running IIS workloads. Passion for resolving reliability issues and identifying strategies to mitigate them going forward. Automation mindset - if you can automate it, do it. Fluency in English. What you'll gain …
nurture others and learn from them, then this is your challenge! The Team: The Infrastructure as a Service (IaaS) team aims at upholding the reliability and scalability we expect from Algolia's infrastructure for its critical systems and products. Our focus is on enabling teams across Algolia to leverage … this infrastructure while keeping it under control through an always increasing level of automation. The Opportunity: The Senior Site Reliability Engineer position within the IaaS team provides a dynamic opportunity for a professional with foundational experience in maintaining and optimising scalable infrastructures. This role specifically concentrates … on three key areas: server and container hosting, cloud and network expertise, and flawless observability. As a Senior Site Reliability Engineer (SRE), you will play a pivotal role in designing, implementing, and maintaining highly available, scalable, and fault-tolerant systems. Your work will directly impact the effectiveness …
Site Reliability Engineer - DevOps Engineer. 18 Month Contract, PAYE - Fully Remote, or Hybrid based in Midlands if preferred. The role: We are working with one of the finest gaming studios in the industry and are on the lookout for an … exceptional Site Reliability Engineer who can bring their expertise and unique thinking to help make their team even stronger! As an SRE, your main purpose is solving for scale through collaboration and automation, bringing engineering principles to infrastructure and operational problems. Work closely with the different teams …
can make a meaningful impact. See more about our culture on . Role Summary: We are seeking highly experienced Site Reliability Engineers (SRE) to shape the reliability, scalability and performance of our platform and customer-facing applications. You will work closely with our software engineers and research … teams to ensure our systems meet and exceed our internal and external customers' expectations. What you will do: As a Site Reliability Engineer, you balance the day-to-day operations on production systems with long-term software engineering improvements to reduce operational toil and foster the reliability … articles and conferences. About you: Master's degree in Computer Science, Engineering or a related field. 7+ years of experience in a DevOps/SRE role. Strong experience with cloud computing and highly available distributed systems. Exposure to site reliability issues in critical environments (issue root cause analysis …
Site Reliability Engineer (SRE) - Payments. London, England, United Kingdom. Software and Services. Description: SRE and Engineering Operations Engineers in the team take part in every aspect of the software development lifecycle. We work in a fast-paced environment and are responsible for hands-on coding of critical … analytical thinking skills. Experience building systems both on-premise (data center) and on public cloud (AWS, GCP, or Azure welcome). Understanding of core SRE concepts - Monitoring, Alerting, Incident management. Preferred Qualifications: Expertise with container platforms (e.g. Docker, or similar). Experience in presenting complex technical concepts to both technical …
matters. We are in it for the long term; come join us on this journey. As a Senior Site Reliability Engineer (SRE), you'll be joining a team whose mission is to ensure the availability, performance, security and reliability of our platform and core services, ensuring … monitoring of those systems, for building tooling and automation to reduce toil and for responding to incidents as part of our 24/7 SRE on-call team. Reliability Engineering at Board Intelligence: The SRE team: Strives to provide the highest standards of Availability, Scalability, Performance and Security for … and responds to incidents as part of a 24/7 rota. Key responsibilities of the role: We're looking for a great Senior SRE to be a hands-on individual contributor to key technical projects and to help us build a first-class SRE function. This role will involve …
part of a team operating global services, handling the requests of hundreds of millions of Apple customers. This kind of scale presents unique challenges. SRE teams at Apple support the full infrastructure stack, from individual API performance to network traffic management. Responsibilities will be both broad and deep. SRE teams … of Apple's payment services including Apple Pay. Good ideas are heard and results are rewarded. As a valued member of our Wallets & Payments SRE team, you'll be on a team whose mission is to build and continuously improve Apple's most critical payment platform services. Do you like … analytical problem solving and analytical thinking skills. Ability to clearly and accurately communicate day-to-day operations to ensure detailed hand-off to regional SRE teams. Preferred Qualifications: Prior experience in supporting large-scale banking or payment systems is a plus. Expertise in API design and interface technologies. Expertise with …
Site Reliability Engineer (SRE) - Data Platform. London, England, United Kingdom. Software and Services. At Apple, we believe that innovation flourishes in an environment where ideas are challenged, collaboration is encouraged, and technology is pushed to its limits. This environment is only possible when diverse minds come together … innovation in everything we do. Imagine what you could accomplish here! Join Apple and help us make the world a better place. As an SRE on our team, you'll be responsible for architecting, optimizing, and scaling distributed storage and analytics systems. You'll collaborate closely with development teams to … hybrid cloud environment. Preferred Qualifications: Knowledge of provisioning, data migration, disaster recovery, and capacity planning. Experience in automating repetitive tasks and processes to enhance reliability and efficiency. Good understanding of networking concepts, including TCP/IP stack, DNS, DHCP, and other standard network protocols. Contribution to team and organizational …
Site Reliability Engineer. Remote - Canada, Americas/Engineering. We offer: The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or … like an environment that you believe could work for you then read on to find out more. The role: We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature, always looking for ways … be an advocate of continuous improvement. Reliability of our new global Tyk Cloud platform. Automation of operations and support. Writing and maintaining documentation on SRE processes and policies. Recommending and implementing ways of driving operational efficiency and driving down our cost to run, without impacting service. Assisting in penetration testing …
Location: Hybrid - 20% in the office per month. Nominet is on the hunt for a skilled Site Reliability Engineer to be a part of our Reliability Engineering function. This team is dedicated to the creation and upkeep of our secure compute platforms, a foundational element of … unwavering commitment to strict security and compliance protocols, you can expect to encounter a myriad of challenging problems to address. The role of the Site Reliability Engineer is vital to Nominet; this role encompasses the design, rollout, and administration of scalable cloud infrastructure, primarily on AWS. The … or tools to further enhance automation, orchestration, and developer experience. About you and your experience: Technical: A suitable candidate would ideally have experience in SRE, platform engineering, DevOps, or a cloud engineering role. AWS: Experience operating production systems on AWS. Holding relevant AWS certifications (like AWS Certified Solutions Architect or …
london, south east england, United Kingdom Hybrid / WFH Options
RP International
Site Reliability Engineer | Inside IR35 | Hybrid - 2 Days Onsite London | 6 Month Contract. Our client, a multinational and respected consultancy, is hiring for a Lead Site Reliability Engineer with expertise in AWS and DevOps tools for a new project in the Public Sector. Technical …