Has anyone actually ever given you a good description of what SRE is? Recently I've met dozens of companies implementing an SRE function. Half are just rebranding an ops team (because Ops ain't cool), some don't want to call the additional silo they have created 'DevOps' (because … apparently that's the wrong thing to do) so they're calling it SRE and the rest actually don't really know how to describe what they're doing. And if you can't describe it simply, you don't know what it is, chief (because Google do it, isn … We discussed Kubernetes, Prometheus and API Gateways. Most importantly, they spoke like they knew what the hell they were on about. Not just about SRE, but on the whole Engineering process. This is a company with $50 million in funding, who are about to introduce a brand new monetisation model …
Greater London, England, United Kingdom Hybrid / WFH Options
Apollo Solutions
Lead Cloud Site Reliability Platform Engineer London Hybrid - 2 days per week onsite Salary: Up to £120k Excellent Benefits and 20% Bonus My client, a global financial client, is looking for a Lead Cloud Site Reliability Platform Engineer to join their team to focus on keeping their … capacity management, backup and recovery etc. Ensuring the team is correctly skilled for the roles and identifying candidates to transition from Ops roles to SRE. Must-Haves: Solid understanding of the SRE role and principles. Experience working with a wide range of products in Azure and GCP, Kubernetes, container registries … working with several CI/CD and infrastructure as code-related tools such as Terraform, GitHub, Azure DevOps, Jenkins, Chef, etc. Experience leading an SRE or Operations team. Negotiating skills to influence technical and leadership decisions to achieve the right consumer outcomes and operational needs. A good understanding of public …
Job Details: DVF Recruitment (https://www.dvfrecruitment.com). Job Description: We are seeking a Site Reliability Engineer to join our SRE team based in Reigate. The ideal candidate will have excellent communication skills, experience working with multiple stakeholders, and a track record in Azure and Observability platforms. You will be joining at an … exciting time of transformation as we work on improving the delivery of value for customers and the business. You will be working in the Site Reliability and Response team, whose responsibility is to deliver and manage business critical services that are used 24/7 by our clients and colleagues … platforms such as Datadog. Proactive monitoring of production and other environments to ensure stability, availability, security and integrity. Collaborate with cross-functional teams to ensure the reliability, availability, and performance of our client-facing services. Engage with business stakeholders to gather requirements, address concerns, and provide updates on projects and system status. Contribute …
Site Reliability Engineer SRE Role in Cardiff once every 2 weeks - £60,000 Growing SaaS company seeks experienced Site Reliability Engineer to join the team. About the Company: Cardiff-based SaaS company having recently received significant government funding to accelerate their growth and product development. They … are now seeking an SRE to support the team and help scale their reliable, high-performing cloud infrastructure. Key Responsibilities: Manage AWS infrastructure, implement monitoring with Prometheus & Grafana. Automate provisioning and deployments using Terraform. Ensure high reliability and performance of our cloud platform. Requirements: Prior experience as SRE/DevOps … Python, Go or Bash. Excellent problem-solving and collaboration abilities. Experience in a SaaS/high-growth tech company. Interview Process: Technical interview. On-site interview to assess technical depth and problem-solving. If you're passionate about building reliable, scalable systems, apply below or pop your CV across …
What you will be doing: As a Site Reliability Engineering (SRE) Manager, you will: Take ownership of your team, being responsible for current team members’ growth and development, plus hiring and onboarding new team members. Create a positive environment where your team members thrive to deliver the best outcomes … accountability for tech decisions. Use your specific experience working with Observability tooling and ecosystems to input into technical decision-making. Work with other stakeholders across engineering to ensure the systems and services your team provides meet the needs of your internal customers. Collaborate, both within your team and across the tribe … working with engineers on their observability requirements, involving other stakeholders, including vendors where necessary. Have an understanding, and ideally experience, of some of the fundamental SRE concepts, such as toil, SLIs/SLOs and error budgets. Have experience of working with cloud native architectures (AWS and GCP are preferred). Have good communication …
WANT A KEY ROLE WORKING WITH ONE OF MANCHESTER'S BEST? Job Title: SITE RELIABILITY ENGINEER (Remote) Skills: Python/Terraform/AWS Location: Manchester City Centre, UK Work Pattern: Remote except for 1 day a month in Manchester centre Salary: £65-70k + additional benefits Summary … Oscar have partnered with one of the UK's largest agencies on their search for an experienced Site Reliability Manager. With the introduction of innovative technology, they have grown as a leading company for spearheading campaigns for household brands. This is a fantastic opportunity that comes with a … individual with experience implementing technology strategy to help develop tech environment management, CI efficiency and security practices in addition to championing DevOps culture and SRE principles. Sound like you… read on! Requirements: CI/CD pipelines, DevOps and SRE Principles, Python, AWS ecosystem. Bonus: Knowledge of IAM policies and config. …
working with a leading global travel company dedicated to providing exceptional experiences for travelers around the world. The Site Reliability Engineering (SRE) team plays a critical role in ensuring the reliability, performance, and scalability of their systems, enabling them to deliver best-in-class services to … performance of the services and applications. Manage end-to-end availability and performance, implementing automation to prevent and resolve issues. Work closely with other engineering teams to leverage existing frameworks and improve overall reliability. Drive the establishment and achievement of service-level objectives to maintain product reliability. Requirements: 4+ … skills and the ability to mentor team members. Passion for learning and staying updated on industry best practices. Nice to Haves: Experience as a Site Reliability Engineer or with high-availability systems. Background in production infrastructure and troubleshooting distributed systems. Familiarity with mobile development and distributed computing. What …
Manchester, England, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for: A Site Reliability Engineering Team Leader who will help facilitate and drive activity and efforts of the team to deliver effective technical solutions to operational problems. The Site Reliability team works with several sections across the business, ensuring that our … Ansible. This role is eligible for inclusion in the Company’s hybrid working from home policy. Preferred skills and experience: Commercial experience with leading engineering teams in the delivery of high quality technical solutions. Passion for working with technology and software development processes and practices. Strong experience working in …
Site Reliability Engineer (Azure) London/Hybrid Up to £100,000 per annum We're looking for an experienced Site Reliability Engineer for our globally recognised Client, one of the largest global investors in the private equity secondary market. As the Site Reliability … Engineer you will ensure the reliability, performance, and availability of our critical systems and applications hosted on the Microsoft Azure Cloud platform. The SRE will work closely with development, operations, and other teams to design, implement, and maintain scalable and resilient infrastructure. Core Responsibilities: * Design, implement, and maintain highly … available and scalable infrastructure on Microsoft Azure Cloud. * Monitor, troubleshoot, and optimise the performance, capacity, and reliability of systems and applications. * Develop and implement automation tools and processes to improve operational efficiency and reduce manual intervention. * Collaborate with development teams to ensure the reliability and performance of applications …
Services Team is looking for a Senior Manager of Cloud Data Platform Operations & Security. This leader would be responsible for the technical development of engineering staff within the cloud data & analytics practice and supporting technologies (e.g. AWS, GCP, and OpenShift). The role is a hands-on, servant leader … and allocating Chapter members to best support the needs of Product and Value Stream Engineering teams using Site Reliability Engineering (SRE) practices, along with a disciplined approach to professional development of Chapter engineers. This role coaches the team to continuously improve their process and observability practices … least one technical implementation within own technical domain. Ensure consistency of technical execution and knowledge across Products, sharing common practices and challenges within the engineering domain. Engineer solutions for special projects as needed. Internal Organization and Engineering Leadership: Develop own engineering Chapter into a highly technically competent …
Site Reliability Engineer with Python. Our client is looking to bring on a site reliability engineer to help deploy, manage, troubleshoot, and enhance our complex cloud-based set of internal tools and externally managed services for a variety of users across our wide-ranging organization. You will … have at least 7 to 10 years' hands-on expertise working as a Site Reliability Engineer. You will work closely with IT, product, and engineering to extend and maintain this set of tools and services and to help debug and resolve problems. In addition, the ideal candidate … Actively lead any critical issue post-mortem processes, including coordination of any meetings and further steps to take. Qualifications: 7+ years' experience with software engineering, software development, and/or system operations. Experience debugging complex problems and implementing timely cost-effective solutions. Experience designing, building, and operating large-scale …
millions of consumers build a brighter financial future and achieve yours along the way with a rewarding career. Site Reliability Engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to IT infrastructure and operations. The main objectives … on availability, latency, performance, efficiency, change management, monitoring, emergency response and capacity planning of their services. As an Application Site Reliability Engineer (SRE) you will be part of a team of people who are responsible for the availability of several of Discover's most critical applications: our PULSE network … online transaction processing using Connex Environmental Database (CED). Have knowledge of Processor Interface and foundational knowledge of Connex Advantage. 3+ years in an SRE or DevOps role. Experience with DevOps tools, processes, and culture. Extensive experience leading customer facing systems in a mission critical environment. Experience with programming and …
Site Reliability Engineer London (Hybrid, 2 days a week on site) Permanent. The Background: We are partnered with an innovative IT consultancy based in London but with a global presence who are leading advisors in their industry by creating lasting value for their clients. Due to growth … a flexible benefits fund. You… In order to be a successful Site Reliability Engineer you will have… Previous experience working as an SRE/at system administrator level. In-depth knowledge of Windows Operating Systems and VMware with a good understanding of Linux Operating Systems. In-depth knowledge … VLANs, Routing, Switching) Security (Splunk, APM, SIEM) Login/Monitoring (Splunk, Elastic, Prometheus, PRTG, Netbox, IPAM, CMDB) Mattermost, Atlassian. The role: As a Site Reliability Engineer you will work on projects relating to application software, operating systems and system management tools as well as maintaining new and …
and unique experience in an inclusive environment that helps them thrive. An exciting opportunity: a financial services client is looking for a DevOps Engineer/SRE/Site Reliability Engineer based in London. Role: DevOps Engineer/SRE/Site Reliability Engineer Location: London (2 days a …
that will require weekly on-site work in Birmingham Your role as the Engineering Manager is to manage the running of the SRE Team within the Secure Development Unit. Managing the support team, business process oversight management and being responsible for the security, system reliability, availability and … service performance of a significant number of security related applications. We are an SRE team responsible for implementing, running and supporting a diverse range of security related tools which support the management of our core IT infrastructure. This includes tools which manage network and IT security, Physical Security and compliance … built on an ethos of continuous improvement; empowering the whole team to be actively engaged in improving our processes and our performance and driving SRE principles into all our Production system support. What you’ll be doing – your accountabilities Coordinates teams through the implementation of new software development life cycle …
Chester, Cheshire, North West, United Kingdom Hybrid / WFH Options
Searchability (UK) Ltd
Site Reliability Engineer Role Description: An opportunity for an experienced site reliability engineer to work for a globally recognised company in the heart of Chester on a hybrid working basis has arisen. You will join a team who are responsible for building a suite of observability … illness. Use of a flex fund to use towards benefits. Wellbeing helpline, mental health first aiders and virtual GP service. Main Responsibilities of a Site Reliability Engineer: Maintain and enhance network monitoring, orchestration, and automation solutions, encompassing tasks such as inventory reconciliation, workflow automation, network configuration validation, health …
Reading, England, United Kingdom Hybrid / WFH Options
Oracle
Senior Site Reliability Engineer – Software Assurance Team Reading (Hybrid 50%) Do you have a passion for high-scale services and working with some of Oracle's most critical customers? We are looking for a Senior Site Reliability Engineer who enjoys applying cutting-edge advances in technology … and enjoy exploring new technologies delivering robust, scalable solutions. Who are we? We are a world class team of high calibre security tool services Site Reliability Engineers. We are an inclusive and diverse team with a full spectrum of experience distributed globally. We have the resources of a … large enterprise and the energy of a start-up, working on a critical greenfield software assurance project collaboratively with our cloud and mobile engineering teams. The Software Assurance organisation has the mission to make application security and software assurance, at scale, a reality. We are a dedicated team, leveraging …
Site Reliability Engineer - Lead, Mentoring, Kubernetes, PaaS, IaaS, SQL, Azure DevOps, CI/CD A leading provider of financial services is seeking two Site Reliability Engineers - Leads with a solid and proven background in Azure or GCP. This position will also be based onsite in London … Will consider candidates from any of the key vendors across the Cloud - Azure, GCP, and AWS. Kubernetes & troubleshooting, managed services like AKS. Using your SRE Attitude (understanding SLI, SLO & SLA). Container Image Management & Security like Aquasec. Code Quality & repository Management like SonarQube & NexusQ. Service Mesh (Istio) traffic shaping, canary, blue … Unit/Integration/Load Testing. Azure Application Gateway & API Management. Azure IAM - Identity & Access Management. Azure Policy Management & Cloud Security. Azure Express Route. Site Reliability Engineer - Lead, Mentoring, Kubernetes, PaaS, IaaS, SQL, Azure DevOps, CI/CD McGregor Boyall is an equal opportunity employer and do not …
processing data at a scale comparable to Meta and Google! They are on the lookout for multiple Senior Site Reliability Engineers (SRE) to join one of their incredibly talented teams. As a Site Reliability Engineer (SRE), you will play a crucial role in ensuring the … reliability, scalability, and performance of our systems and infrastructure. You will work closely with cross-functional teams to design, implement, and maintain robust and resilient systems, with a focus on automation, monitoring, and incident response. The role: • Working arrangements: Flexible – can be fully remote (UK residents only – unfortunately, Visa … support our core products and services. Develop and maintain automation tools and scripts for deployment, monitoring, and management of infrastructure components. Collaborate with software engineering teams to ensure that applications are designed with reliability, scalability, and performance in mind. Implement and maintain monitoring, alerting, and logging systems to …
Site Reliability Engineer. Currently seeking an experienced SRE to join a cloud team and support the deployment of global products. In this role, you'll be responsible for setting up production and internal environments, providing 24/7 first-line engineering support, and driving effective resolutions of … Maintaining and enhancing CI/CD tooling, Terraform scripts, and engineering operational documentation. Providing expertise in building and maintaining product operational documentation and SRE practices. Supporting security and compliance governance in production environments. Collaborating closely with cross-functional teams across multiple regions. Tech Stack: Terraform for Infrastructure as Code … platforms (GCP ideally or AWS). Programming languages: Python or Go. To be a great fit for this role, you should have enterprise experience in SRE, DevOps, or production engineering, with proficiency in the above tech stack. Expertise in deploying and monitoring highly scalable products, and an understanding of SRE …
Lead Site Reliability Engineer London Hybrid Salary: up to £118,000 + Package Gresham Hunt is currently partnered with a leading financial institution in their search for a Lead Site … Reliability Engineer. This is an exciting opportunity to lead a team of tech enthusiasts while diving deep into the world of DevOps and SRE in which you will orchestrate systems engineering, platform development, and ensure fault-tolerant, scalable operations. The successful candidate will: Experience with several of the … or Bash/Shell. Previous experience in a leadership role in technology. Background in senior engineering, ideally transitioning from software engineering to SRE/DevOps. Familiarity with Azure Cloud services like AKS, Key Vault, APIM, Application Gateway, and SQL. Strong DevOps skills, particularly in using integration tools such …
roles where you can make a significant impact on the availability, performance, and efficiency of critical services? If you've previously excelled in an SRE or similar operations environment and are looking for your next challenge, we want to hear from you! These opportunities require you to work one day … Overview: As part of our client's dedicated Mortgages team, you'll be instrumental in working within a new Site Reliability Engineering (SRE) Function, focusing on enhancing system reliability across key areas such as availability, performance, latency, efficiency, capability, and incident response. This role is crucial as … risks effectively. What They're Looking For: Proven experience in software engineering with a strong background in Java or C#. Experience in an SRE function or similar operations environment, excluding purely DevOps, infrastructure, or deployment analyst roles. Familiarity with AWS, Kubernetes, and moving systems from data centers to cloud …
Manchester, England, United Kingdom Hybrid / WFH Options
MRJ Recruitment
Calling all skilled Site Reliability Engineers!⚡️ We're on the hunt for a motivated and friendly individual to join our client's team as a Site Reliability Engineer! 💪 You'll play a big part in maintaining the tech infrastructure of one of the top B2B brands … in line with industry standards. Collaborate with others when products and features are being released and ensure they are built in line with brand standards. Site Reliability Engineer – Requirements: Relevant commercial experience overseeing production systems. Background programming and scripting using Python, Go, BASH etc. Comfortable using automation and IaC …
Job description: Site Reliability Engineer - Team Manager (SRE-WTW_1675689117). Site Reliability Engineer/Team Manager - Hybrid - Up to 110,000. I am working with an insurance and technology consultancy who provide data-driven, insight-led solutions to their customers to help them become more resilient and get the best possible … taking in their best interests and understanding every need to give them the best outcome. You will be leading a group of well-versed Site Reliability Engineers and play a big part in the success of the growing software-as-a-service solutions they provide. You will be … to 20% bonus and plenty of other benefits to go alongside. Skills and experience: * Public cloud, ideally Azure. * Leading and building technical teams. * Site Reliability Engineering or similar experience. * Architectural Design, Development or Cloud Security. Be the first to secure an interview as the slot will …
Site Reliability Engineer - 3rd Level, Support, Degree, Cloud, Python, AWS, £competitive + benefits, hybrid/Cambridge. My client is one of the most innovative software houses in the UK. With a bunch of dazzling awards behind them, they continue to take artificial intelligence to a whole new level. … at the heart of what they do, collaborating as one team and valuing people's suggestions and ideas. We have an impressive opening for a Site Reliability Engineer who is looking to join an award-winning software house in Cambridge. The type of person we are hoping to speak … and must enjoy operating in an ever-changing landscape. - Degree educated at a top-200 world university, possessing a 2.1 or better in engineering, physics, mathematics, computing, natural sciences or something similar, but we are flexible on the subject studied. - Experience within a DevOps or Site Reliability …