excellence. Develop and implement strategic plans to enhance the reliability, scalability, and efficiency of our infrastructure. Collaborate with cross-functional teams to align SRE initiatives with broader organizational goals. Establish and maintain SLIs, SLOs, and SLAs for critical systems and services. Drive the adoption of best practices in automation … and management solution that helps organizations harness AI's potential while ensuring governance, security, compliance, and control. Experience Requirements: Proven experience in a senior SRE role or similar. Strong knowledge of cloud technologies and SLA, SLO, and SLI management. Experience leading teams and implementing SCRUM processes. Excellent communication and leadership skills. … Experience line managing, mentoring, and coaching. Responsibilities: Collaborate with the Principal SRE to shape and implement the SRE strategic plan. Lead the SRE team in translating strategy into actionable plans, coordinating these through the SCRUM process. Address wellbeing and performance concerns, fostering a positive and productive team environment. Work with …
our collective success? We seek a talented Engineering Lead to join our dynamic team and lead our Site Reliability Engineering (SRE) function. This role ensures our systems are reliable and scalable, directly impacting user satisfaction. By integrating SRE activities across teams, you'll foster collaboration and … alignment with senior management will keep us competitive and innovative, driving collective success. What will you do in this role? Oversee and manage the SRE engineering team to ensure continuous improvement in reliability and scalability while ensuring conformance to our security standards. Lead the integration of SRE activities across Application … Computer Science, or a related field. Proven experience in a leadership role within an engineering team. Strong technical background with expertise in DevSecOps, SRE, and Agile. Excellent technical and organizational skills. Strong problem-solving abilities and attention to detail. What we prefer you to have: Effective communication and interpersonal skills. …
Global Services Provider who are looking for a Site Reliability Engineer to join their team, where you will manage and guide the SRE team. As a Site Reliability Engineer, you will be responsible for: Manage and guide the Site Reliability Engineering (SRE) team … while promoting SRE principles throughout FCA product groups. Serve as the SRE subject matter expert and strategic lead within the delivery organization. Oversee and advise on daily operations related to observability tools, including their upkeep and optimization. Provide hands-on support to engineering teams for delivering observability initiatives, as … the reliability of quality assurance results. Proven skills and experience to help you succeed in this role: Strong experience with a primary role of SRE Engineer. Strong experience in DevOps tools (GitHub, GitHub Actions, Workflow, CodeQL, Jenkins, Nexus, CloudFormation/Terraform etc.). Strong experience in monitoring tool (Datadog …
Job Title: Site Reliability Engineering (SRE) Lead – Observability Location: Stratford, London (Hybrid – 2 days per week onsite) Contract Length: 6 months Rate: £450–£500 per day (Inside IR35) Industry: Financial Services A leading Financial Services organisation in London is seeking a Site Reliability Engineering (SRE) Lead – Observability to join their team on a 6-month contract. This is a hybrid role requiring two days per week onsite at their Stratford, London offices. The role sits Inside IR35. Key Responsibilities: Lead the SRE Observability team and champion observability practices across multiple product groups. … creation and QA of project-level Observability Plans. Input into and assure the quality of testing strategies and results. Requirements: Proven experience in an SRE role with a strong focus on Observability. Expert-level proficiency with DevOps tools including GitHub, GitHub Actions, Jenkins, Nexus, CloudFormation/Terraform, and CodeQL. Extensive …
london, south east england, united kingdom Hybrid / WFH Options
MarkJames Search
site About this Opportunity Great opportunity for a Senior Site Reliability Engineer to join our Financial Wellbeing Platform. As a Senior SRE you'll be responsible for ensuring our products run reliably, are scalable, and perform optimally in production environments. You'll monitor and manage these aspects … What You'll Need: Strong understanding of Site Reliability Engineering with commercial experience in working in a relevant environment and putting SRE principles into practice. Stakeholder management experience and the ability to guide and consult engineering teams. Strong DevOps understanding, including experience of Infrastructure as Code … and CI/CD pipelines, such as Terraform and Jenkins, or alternatives such as GCP Cloud. SRE experience and broad set of relevant product knowledge. Knowledge of SLAs, SLOs and SLIs is essential along with the best practices for defining and implementing them. Confidence and capability to communicate complex technical …
Site Reliability Engineer London - 3 days in office mandatory Full-time permanent Up to £90,000 + 5 annual performance-related bonus We have an exciting new opportunity for a Site Reliability Engineer to join Robert Walters as a Consultant. As an Employed Consultant, you will … rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems! As an SRE you will join a Global Investment Bank within the SRE Payments Technology Team in the Corporate & Investment Bank line of business, where you will solve complex … and iteratively improve on existing solutions. You are a significant contributor to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of your application or platform. Job responsibilities: * Guides and assists others in the areas of building appropriate level designs and gaining consensus …
Preferred Qualifications: Master's degree in Computer Science or Engineering, or a related field. About the Job Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services - both … our systems capacity and performance. Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding … algorithms, complexity analysis and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a …
automate routine tasks. Systematic problem-solving approach, coupled with effective verbal and written communication skills. About the Job Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services - both … our systems capacity and performance. Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding … algorithms, complexity analysis, and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving, and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences, and perspectives. We encourage them to collaborate, think big, and take risks in a …
We are seeking a highly skilled and proactive Oracle Site Reliability Engineer (SRE) to ensure the reliability, performance, and scalability of our critical Oracle-based applications and services supporting a global user base. The ideal candidate will possess … deep expertise in Oracle technologies and SRE methodologies. You will be responsible for ensuring the stability and efficiency of our Oracle systems, implementing automation, managing patching, and providing expert-level support to our global users. Strong cross-functional collaboration and a proactive approach to problem-solving are essential for success … troubleshoot IT systems. Leverage cloud platforms and automation tools to enhance scalability and efficiency. Ensure compliance with IT standards and regulations. Apply knowledge of SRE (Site Reliability Engineering) and/or DevOps practices to improve system reliability and performance. Maintain UK Security Clearance BPSS (Baseline Personnel …
with purpose. About this opportunity Great opportunity for a Senior Site Reliability Engineer to join our Financial Wellbeing Platform. As a Senior SRE you'll be responsible for ensuring our products run reliably, are scalable, and perform optimally in production environments. You'll monitor and manage these aspects … What you'll need: Strong understanding of Site Reliability Engineering with commercial experience in working in a relevant environment and putting SRE principles into practice. Stakeholder management experience and the ability to guide and consult engineering teams. Strong DevOps understanding, including experience of Infrastructure as Code … and CI/CD pipelines, such as Terraform and Jenkins, or alternatives such as GCP Cloud. SRE experience and broad set of relevant product knowledge. Knowledge of SLAs, SLOs and SLIs is essential along with the best practices for defining and implementing them. Confidence and capability to communicate complex technical …
can make a meaningful impact. See more about our culture on . Role Summary We are seeking highly experienced Site Reliability Engineers (SRE) to shape the reliability, scalability and performance of our platform and customer-facing applications. You will work closely with our software engineers and research … teams to ensure our systems meet and exceed our internal and external customers' expectations. What you will do As a Site Reliability Engineer, you balance the day-to-day operations on production systems with long-term software engineering improvements to reduce operational toil and foster the reliability … and conferences. About you Master's degree in Computer Science, Engineering or a related field. 7+ years of experience in a DevOps/SRE role. Strong experience with cloud computing and highly available distributed systems. Exposure to site reliability issues in critical environments (issue root cause analysis …
The SRE Manager is responsible for leading the Site Reliability Engineering function across Europe, ensuring the reliability, scalability, and performance of critical infrastructure and services. This role plays a key part in the global follow-the-sun support model, working closely with the Global SRE Leader … team. You'll collaborate with Engineering, Infrastructure, and Operations teams to maintain high availability and resilient service delivery, while also mentoring a regional SRE team focused on continuous improvement and innovation. Key Responsibilities: Technical Leadership: Develop deep expertise in the Titanium trading platform to lead and support critical business … ensuring priorities align with business goals and resource capacity. Operational Excellence: Champion initiatives that enhance system availability, scalability, and performance. Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery). Cross-Functional Collaboration: Partner with Software Engineering, Infrastructure, Operations …
Lead SRE - FinTech - £125K Our client is one of the world’s leading FinTech companies and are building out their Site Reliability Engineering (SRE) team in the UK. They’re looking to hire an experienced SRE to lead the team, grow it and drive engineering forwards. … The role will still be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge … able to define and drive technology strategies - You’ll have the chance to make the team your own! Requirements: Very strong technical experience of Site Reliability/DevOps Engineering. Good experience of Java and/or Python development and scripting. Very good experience of containers, monitoring, automation …
Location: Hybrid - 20% in the office per month Nominet is on the hunt for a skilled Site Reliability Engineer to be a part of our Reliability Engineering function. This team is dedicated to the creation and upkeep of our secure compute platforms, a foundational element of … unwavering commitment to strict security and compliance protocols, you can expect to encounter a myriad of challenging problems to address. The role of the Site Reliability Engineer is vital to Nominet; this role encompasses the design, rollout, and administration of scalable cloud infrastructure, primarily on AWS. The chosen … or tools to further enhance automation, orchestration, and developer experience. About you and your experience Technical A suitable candidate would ideally have experience in SRE, platform engineering, DevOps, or a cloud engineering role. AWS: Experience operating production systems on AWS. Holding relevant AWS certifications (like AWS Certified Solutions …
SRE Lead - FinTech - £120K+ Our client is one of the world’s leading FinTechs and are building out their Site Reliability Engineering (SRE) team in the UK. They’re looking to hire an experienced SRE to lead the team, grow it and drive platform engineering forwards. … The role will still be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge … able to define and drive technology strategies - You’ll have the chance to make the team your own! Requirements: Very strong technical experience of Site Reliability/DevOps Engineering. Good experience of Java and/or Python development and scripting. Very good experience of containers, monitoring, automation …
what matters. We are in it for the long term, come join us on this journey. As a Senior Site Reliability Engineer (SRE), you'll be joining a team whose mission is to ensure the availability, performance, security and reliability of our platform and core services, ensuring … monitoring of those systems, for building tooling and automation to reduce toil and for responding to incidents as part of our 24/7 SRE on-call team. Reliability Engineering at Board Intelligence The SRE team: Strives to provide the highest standards of Availability, Scalability, Performance and Security … and responds to incidents as part of a 24/7 rota Key responsibilities of the role We're looking for a great Senior SRE to be a hands-on individual contributor to key technical projects and to help us build a first-class SRE function. This role will involve …
Site Reliability Engineer (SRE) Hybrid working Who are we? Toyota Connected Europe aims to create a better world through connected mobility for all. We are a new company focused on integrating big data and a customer-centric approach into all aspects of the mobility experience to make it … a start-up culture where every member acts like an owner, with immediate impact and visibility of their work. About the role: Our Cloud Engineering team plays a crucial role in Toyota Connected Europe's success by providing the necessary tools and processes for global growth and scalability. We … aim to enhance agility, effectiveness, and innovation, collaborating with product teams to align on technological and project goals. As a Site Reliability Engineer, you will manage and improve complex cloud operations for one of the world's largest automotive companies. You will work in a fast-paced, innovative …
IT solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a … improve efficiency and reduce delivery time of applications and infrastructure. Other duties as needed. About You: 7+ years' experience in DevOps and/or Site Reliability Engineering roles. 3+ years' experience with an object-oriented language (preferably Java, .NET or C++). Expert+ level Linux administration, scripting, and …
Job Description Site Reliability Engineer with Python Our client is looking to bring on a site reliability engineer to help deploy, manage, troubleshoot, and enhance our complex cloud-based set of internal tools and externally managed services for a variety of users across our wide-ranging organization. … You will have at least 7 to 10 years' hands-on expertise working as a Site Reliability Engineer. You will work closely with IT, product, and engineering to extend and maintain this set of tools and services and to help debug and resolve problems. In addition, the … Actively lead any critical issue post-mortem processes, including coordination of any meetings and further steps to take. Qualifications • 7+ years' experience with software engineering, software development, and/or system operations • Experience debugging complex problems and implementing timely cost-effective solutions • Experience designing, building, and operating large-scale …
Site Reliability Engineer Remote - Canada, Americas/Engineering We offer The Tyk API Management platform is helping to drive the connected world and power new products and services. We're changing the way that organisations connect any number of their systems and services. Whether internal, external, public or … like an environment that you believe could work for you then read on to find out more. The role: We're looking for a Site Reliability Engineer to manage, maintain, improve and provide support on our platform. You will be curious by nature, always looking for ways to … be an advocate of continuous improvement. Reliability of our new global Tyk Cloud platform. Automation of operations and support. Writing and maintaining documentation on SRE processes and policies. Recommending and implementing ways of driving operational efficiency and driving down our cost to run, without impacting service. Assisting in penetration testing …