Sheffield, England, United Kingdom Hybrid / WFH Options
KnowBe4, Inc
Snr. Site Reliability Engineer (Remote position located in Leeds/Sheffield, United Kingdom) Sheffield, United Kingdom About KnowBe4 KnowBe4, the provider of the world's largest security awareness training and simulated phishing platform, is used by tens of thousands of organizations around the globe. KnowBe4 enables organizations to manage the ongoing problem of social engineering by helping … person, we strive to make every day fun and engaging; from team lunches to trivia competitions to local outings, there is always something exciting happening at KnowBe4. KnowBe4’s Site Reliability Engineers help ensure that our platforms are reliable, secure, scalable, and efficient. They work alongside other engineers in a fast-paced, agile development environment, and share solutions … to advance the technologies running our systems, improve their safety and reliability, and make the complex distributed services that deliver our platforms easy to understand. The ideal member of our team gets excited about new AWS service releases, stays up-to-date on industry trends and design patterns, and has excellent time-management and communication skills. Some of the …
Site Reliability Engineering within Documents & Biometrics is responsible for ensuring GBG delivers a world-class experience for all our customers and team members globally. The Site Reliability Engineering Team is a 2nd line technical function, providing a gateway service between 1st line Customer Support and Technology 3rd line Engineering for supported products and services consumed by … identify, prevent and resolve customer impacting issues while providing a feedback loop to engineering to ensure continual service improvement for an outstanding customer service experience. The role of the Site Reliability Engineering Team Lead is to provide people management and technical leadership for GBG Documents & Biometrics production systems 24/7/365 (outside of core working hours … service performance for customer user journeys, as well as responding to system events, trends, and alerts. You will be responsible for the work activity, skills, and capability of the Site Reliability Engineering team. What you will do As a people manager, you will be responsible for performance management and development of your team and resourcing where required. Take …
Job Description Insight Global is seeking an Operations Site Reliability Engineer to provide global operational support for a leading infrastructure software company … s customer-facing SaaS products. You will join a team of engineers demonstrating exceptional technical expertise, managing mission-critical infrastructure, and ensuring optimal availability (24x7x365), performance, and security. This SRE role involves monitoring, maintaining, and enhancing the availability and performance of production services. Responsibilities include driving automation to minimize failures and manual tasks, supporting stakeholder requests within agreed SLAs, and …
Insight Global is looking for an Operations Site Reliability Engineer to help with global operational support for a leading infrastructure software product company’s customer-facing SaaS products. You … will be part of a team of engineers that demonstrates superb technical competency, operates mission-critical infrastructure and ensures the highest levels of availability (24x7x365), performance and security. This SRE would be part of the critical operations function that is responsible for the monitoring, availability and performance of production services. They would be driving automation to reduce failures, manual tasks …
Join Barclays as a Senior Site Reliability Engineer and become part of our newly formed Core SRE Team. This team will establish a Centre of Excellence to enhance and promote SRE best practices across GTIS. As a key hire, you will play a pivotal role in raising awareness and driving adoption of SRE methodologies within various GTIS … across GTIS and CTO, engaging with storage, data, and other product teams. You will act as a trusted advisor, providing strategic guidance and consultative support to help teams improve reliability, scalability, and efficiency. To be successful in this role you should have: Proficiency in Programming and Scripting - This includes expertise in languages such as Python, PowerShell, or Go, which … reliability at scale. Influential Communication Skills - The ability to communicate effectively with team members and stakeholders, ensuring alignment, inspiring and motivating them to embrace new mindsets, cultures, and SRE working practices. This skill is crucial for driving meaningful change and fostering a collaborative environment where innovative ideas can thrive. Some other highly valued skills include: Knowledge of Cloud Computing …
The Site Reliability Engineering (SRE) team at Pendo is responsible for provisioning and maintaining cloud infrastructure from development through production for all product initiatives, and working with developers and product managers to ensure that our products are not only reliable and performant, but also cost-efficient. Our platform is built on Google Kubernetes Engine (GKE) and utilizes several … In production, SREs perform Tier 1 on-call and incident management functions, supporting a high-throughput platform which processes more than 15 billion events per day. To ensure the reliability of this environment for our customers, SREs work closely with developers and product managers to understand service level objectives, think through failure scenarios, and design systems which balance cost … with reliability objectives. Additionally, SREs collaborate with the Information Security team to ensure that cloud infrastructure is properly secured, and that sufficient controls are in place to meet our compliance goals with respect to industry standards such as SOC 2. Role Responsibilities Write high-quality infrastructure-as-code that automates the provisioning, deployment, scaling, and monitoring of Pendo's …
A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering … expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. … user demands and enhance overall service performance. This role is eligible for inclusion in the Company's hybrid working from home policy. Preferred skills and experience Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary …
Stoke-on-Trent, England, United Kingdom Hybrid / WFH Options
bet365
Site Reliability Engineer bet365 Stoke-On-Trent, England, United Kingdom A Site Reliability Engineer who will enhance system reliability, observability, and performance through a strong engineering approach, and assist with incident resolution and best practices. You will have software … engineering skills, focusing on system reliability and observability. You will monitor the health, performance, and availability of critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices, and develop features for maintainability. You will also help engineer tools … and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development lifecycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure our systems meet user demands and enhance overall service …
Stafford, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of … critical systems, directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and … user demands and enhance overall service performance. This role is eligible for inclusion in the Company’s hybrid working from home policy. Preferred skills and experience Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary …
Manchester, England, United Kingdom Hybrid / WFH Options
bet365
A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems … directly impacting operational efficiency. Using your engineering expertise, you will implement solutions that enhance reliability, including service instrumentation with tools such as OpenTelemetry, improve logging practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best … user demands and enhance overall service performance. This role is eligible for inclusion in the Company’s hybrid working from home policy. Preferred skills and experience Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary …
Stratford-upon-Avon, England, United Kingdom Hybrid / WFH Options
NFU Mutual
exciting new tooling to support project delivery Hybrid up to 80% homeworking available and 20% in Stratford-upon-Avon About the role We have an exciting opportunity for a Site Reliability Engineer (Observability) to join our Monitoring Team within IT Infrastructure Products on a 12-month fixed term contract. You’ll play a key role in the … with the business and 3rd party suppliers to ensure that our portfolio of applications and infrastructure services are operating effectively and deliver value through proactive monitoring. As a Site Reliability Engineer, you’ll be responsible for deploying new monitoring tooling across our IT estate to support high-profile change activities, understanding the key business processes, and using your … business teams as an authority on observability and will work with our architecture and design teams to incorporate an effective monitoring strategy into new solutions. About you As a Site Reliability Engineer, you’ll use your knowledge and experience of observability and IT Event Management to deliver excellent results across a varied IT estate. You’ll have knowledge of …
along the way! Job Summary We have built Curve Dental into an industry-leading provider of beautiful cloud software for the dental industry. Who We're Looking For Our Site Reliability Engineers (SREs) are passionate about automation and its power to streamline the deployment and operation of software. They collaborate closely with developers to support a wide range …
is to be an effective enabler of Capital One's ambitions. We are keen to add a Senior Site Reliability Engineering Manager (SSREM) to our Nottingham based SRE organisation whose primary focus is to provide effective leadership as we evolve and mature site reliability practices for the benefit of our cloud applications and their customers. The … successful candidate will be a leader of leaders with custodianship of application services across 5+ SRE teams. We're looking for an experienced professional whose technical background allows effective challenge and support of teams managing primarily Java based applications running in a dynamic IaaC AWS cloud environment. A proven ability to lead, inspire, include, empower, coach and develop their teams … to deliver challenging outcomes in the pursuit of business, functional and personal goals. The successful applicant will lead by example and build strong and valuable relationships with the SRE org, wider tech and business stakeholders. They have the ability to face ambiguity and understand how to make sense of complexity, importantly being able to communicate this to varying levels of seniority …
Join us as an AVP DevOps & SRE Engineer within the Collateral Management Platform project, where you will architect, implement, and maintain CI/CD pipelines while ensuring system reliability, scalability, and security across our microservices ecosystem. This role offers the opportunity to work at the intersection of cutting-edge technology and financial services, building a modern collateral management … excellence. Staying informed of industry trends and contributing to technology communities. Adhering to secure coding practices and implementing effective unit testing. Qualifications and skills: Experience as a DevOps/SRE Engineer, with knowledge of GitLab pipelines, Docker, containerization, and Chef. Valued skills include C#, SQL, PowerShell, and Python. This role is based out of our Glasgow Campus. Accountabilities include …
South West London, London, United Kingdom Hybrid / WFH Options
Halian Technology Limited
Senior Site Reliability Engineer in the UK UP TO £150K fully remote Location: Remote … Type: Permanent, Full-Time Are you among the top 1% of Site Reliability Engineers in the UK? Our client, an IT Service Management company, is building a world-class SRE team to support a mission-critical Java-based platform used by millions. If you’re a hands-on engineer with a background in Linux systems, deep AWS expertise, and a passion for incident response, reliability, and scale … engineers and developers to support a Java-based product Operate in a manual, tool-light environment while helping us scale and automate What We’re Looking For: 7-12 years of experience, with 5+ years in SRE roles Strong Linux/System Admin foundation Proven experience in live incident troubleshooting and root cause analysis Deep AWS knowledge: you can speak to how you’ve used services like EKS, EC2, Load Balancers in production Experience …
Join us as a Senior Site Reliability Engineer - Oracle where you'll spearhead the evolution of our digital landscape, driving innovation and excellence. This role will include: applying software engineering techniques, automation, and best practices in incident response, ensuring the reliability, availability, and scalability of the systems, platforms, and technology through them. To be successful as … a Senior Site Reliability Engineer - Oracle you should have experience with: Oracle Enterprise Manager (OEM), Oracle Internet Directory (OID), Oracle database Performance Tuning – SME SQL Optimization Strong skills in scripting languages (e.g., Python, Bash) to automate repetitive tasks and knowledge of configuration management tools (e.g., Expertise in setting up and maintaining monitoring systems (e.g., AWS, Azure, Google … acumen strategic thinking and digital and technology, as well as job-specific technical skills To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolution, analysis and …
SRE/DevOps Engineer – High Frequency Trading - Multi Strategy Hedge Fund - Multi Billion Dollar Hedge Fund - Multiple Headcount - Open to Relocation - Up to £700k TC, Slough Location: Slough, United Kingdom Job Category: Other EU work permit required: Yes Job Views: 2 Posted: 06.06.2025 … Expiry Date: 21.07.2025 Job Description: SRE/DevOps Engineer – High Frequency Trading - Multi Strategy Hedge Fund - Multi Billion Dollar Hedge Fund - Multiple Headcount - Open to Relocation - Up to £700k TC Join a leading multi-strategy hedge fund, where you’ll collaborate with elite engineers and top investment professionals to develop cutting-edge trading technology. We are seeking … highly skilled SRE/DevOps Engineers with 5+ years of experience to join a leading multi-strategy hedge fund. This role offers an opportunity to work in a fast-paced, technology-driven trading environment where you will collaborate with elite engineers and top investment professionals to develop cutting-edge trading technology. Key Responsibilities: Drive the transformation of trading, research, and …
Site Reliability Engineering (SRE) Lead Salary: National circa £65,000/London circa £75,000 Are you passionate about infrastructure automation and engineering? Do you have experience of building reliable scalable systems, or previous knowledge of Service Reliability Engineering? Do you have experience working hands-on with automation approaches and tools in an infrastructure engineering or operations … We are looking for a passionate Cloud engineering professional to join our diverse and growing team to shape and actively contribute to the future of Cloud Service Desk and Service Reliability engineering teams in Aviva. A bit about the job: Manage BAU tasks to support Cloud Hosting Platform Services in AWS and Microsoft Azure, including incident and problem management, patching …
Sr. Site Reliability Engineer (Kubernetes), Slough Client: VeeAR Projects Inc. Location: Slough, United Kingdom Job Category: Other EU work permit required: Yes Job Views: 4 Posted: 31.05.2025 Expiry Date: 15.07.2025 Job Description: Minimum Qualifications: • 7+ years experience in building, operating …
cloud environments Reliability Engineering: Lead initiatives to improve system reliability, establish SLOs, and implement monitoring and alerting strategies Team Leadership: Build, mentor, and grow a high-performing SRE team while fostering a culture of innovation and continuous improvement Incident Management: Establish and optimize incident response processes, lead major incident reviews, and drive systematic improvements Automation Development: Spearhead automation … operations and improve system reliability Performance Optimization: Lead projects to optimize system performance, capacity planning, and cost efficiency Cross-team Collaboration: Work closely with development teams to implement SRE best practices and drive operational excellence Technical Strategy: Develop and execute technical roadmaps aligned with business goals and scaling requirements Security Integration: Ensure security best practices are embedded in infrastructure … service providers Operational Excellence: Drive continuous improvement in operational processes, tooling, and methodologies What You Bring To The Role Technical Leadership Experience: 5+ years of experience leading and managing SRE/DevOps teams, with a proven track record of improving system reliability and performance Architectural Vision: Deep understanding of distributed systems, cloud platforms (AWS/GCP/Azure), and …
Site Reliability Engineering Team Lead, Crawley, West Sussex Client: Cornucopia IT Resourcing Location: Crawley, West Sussex, United Kingdom Job Category: Other EU work permit required: Yes Job Views: 4 Posted: 14.06.2025 Expiry Date: 29.07.2025 Job Description: Site Reliability Engineering Team Lead Our client is a leading provider … innovative, and reliable analytics software designed to meet the dynamic needs of organizations across various industries. They are looking for an experienced and driven Site Reliability Engineering (SRE) Lead to join their growing team. In this key leadership role, you will guide and support a team of SREs, elevate our infrastructure strategy, and champion operational excellence. You will … with AWS (EC2, ECS, RDS, S3, Fargate, IAM, VPC, etc.). Proficiency in infrastructure automation and configuration management, especially Ansible (Terraform experience is a plus). Strong understanding of SRE and DevOps principles, including CI/CD, IaC, and GitOps.
Job Description Site Reliability Engineering Team Lead Crawley/Hybrid £100,000+ Our client is a leading provider of cloud-based call and contact analytics solutions, helping businesses enhance their communications and improve operational efficiency. As a modern SaaS vendor, they deliver scalable … innovative, and reliable analytics software designed to meet the dynamic needs of organizations across various industries. They are looking for an experienced and driven Site Reliability Engineering (SRE) Lead to join their growing team. In this key leadership role, you will guide and support a team of SREs, elevate our infrastructure strategy, and champion operational excellence. You will … with AWS (EC2, ECS, RDS, S3, Fargate, IAM, VPC, etc.). Proficiency in infrastructure automation and configuration management, especially Ansible (Terraform experience is a plus). Strong understanding of SRE and DevOps principles, including CI/CD, IaC, and GitOps.