Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Twinstream Limited
Site Reliability Engineer | £65,000–£95,000 DOE | Hybrid (Bristol-based, occasional site visits) | Clearance: Must be eligible for DV Clearance. Founded in 2019 by engineers solving complex cross-domain problems for government organisations, TwinStream delivers technical excellence and exceptional service to high-profile clients. Our teams work both on-site and remotely … supporting mission-critical systems where performance and reliability are paramount. The Site Reliability Engineer Role: We are seeking a Site Reliability Engineer (SRE) to ensure the availability, performance, and cost-effectiveness of our cloud and on-prem services. You will collaborate with software engineers and system administrators to improve observability, reduce downtime, and … proactively mitigate reliability risks across a growing portfolio of services. Key Responsibilities of the Site Reliability Engineer: Improve reliability and performance across multiple subsystems. Automate manual tasks and eliminate unnecessary alerts. Enhance monitoring capabilities to identify and resolve issues before they impact users. Support and optimise CI/CD pipelines and cloud infrastructure. Research and …
Farnborough, Hampshire, England, United Kingdom Hybrid / WFH Options
Addition
Site Reliability Engineer (Defence) This is a chance to join a forward-thinking digital solutions business delivering secure technology for the Defence and Security sector. As a Site Reliability Engineer, you’ll be at the heart of building, scaling, and maintaining critical platforms that underpin mission-ready technology. Role Overview: Role: Site Reliability Engineer. Location: Hybrid, 3 days per week in Farnborough. Package: £60,000-£70,000 per annum plus benefits. Industry: Defence & Security. What You’ll Be Doing: Designing and maintaining Kubernetes environments for scalable deployments. Building and optimising CI/CD pipelines to improve efficiency. Implementing monitoring systems to ensure reliability and performance. Driving automation initiatives to reduce manual … in security, maintainability, and scalability. Staying ahead of emerging technologies to keep the platform cutting-edge. Main Skills Needed: Applicants must be eligible for Security Clearance. Proven experience in Site Reliability or Platform Engineering (5+ years). Strong knowledge of Kubernetes and container orchestration. Expertise in CI/CD tools (Jenkins, GitLab, etc.). Experience with AWS is …
Washington, Washington DC, United States Hybrid / WFH Options
ClearanceJobs
Remote - Site Reliability Engineer (SRE) ClearanceJobs is aiding their partner, headquartered in New York City and widely recognized as the industry leader in CPS protection, in their search for a skilled Site Reliability Engineer (SRE). The selected candidate will support and maintain our customers' FedRAMP-compliant deployment in AWS GovCloud for public sector … customers. The SRE will be responsible for ensuring high availability, security, and compliance of cloud-based environments while driving automation, monitoring, and incident response best practices. U.S. Citizenship (required for working in GovCloud environments) Terms: Full-time/Direct Hire Location: Remote (DMV area) Salary: $200k-$260k (will fluctuate pending experience) Qualifications: • 6-8+ years of experience in SRE, DevOps … and scripting (Python, Bash). • Experience with logging, monitoring, and observability tools in a cloud-native environment. • Strong troubleshooting, problem-solving, and automation mindset. Responsibilities/Impact as an SRE: • AWS GovCloud Operations: Manage and optimize cloud-based infrastructure in AWS GovCloud, ensuring FedRAMP compliance and high availability. • Reliability & Performance: Monitor and enhance system performance, scalability, and reliability …
Role Overview: We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our engineering team to support critical application deployments in a "follow-the-sun" environment. In this role, you will leverage your expertise in cloud provisioning, infrastructure as code, and container orchestration to ensure the reliability, scalability, and performance of our … and versioning. Containerization and Orchestration: Deploy, manage, and provide ongoing support for containerized applications using Kubernetes, including Amazon EKS (Elastic Kubernetes Service) and Azure Kubernetes Service (AKS), ensuring their reliability, availability, and performance. Monitoring and Alerting: Monitor application performance and system health through observability tools (e.g., Prometheus, Grafana, ELK stack), proactively identifying and resolving issues to ensure high availability … and solutions, including RESTful APIs, ensuring seamless integration across platforms. Post-Mortem Analysis: Conduct comprehensive post-mortem analyses following incidents, identifying root causes and recommending improvements to enhance system reliability and performance. Mentorship: Mentor and guide junior engineers, fostering a culture of knowledge sharing and continuous improvement within the engineering team. Skills and Experience: Bachelor's degree in computer …
Lisburn, County Antrim, United Kingdom Hybrid / WFH Options
Camlin
problems. As of today, the Camlin operation spans over 20 countries across the globe. Job Overview: We are seeking a dedicated and experienced Site Reliability Engineer (SRE) to join our dynamic team. The SRE will be responsible for ensuring the reliability, performance, and availability of our critical systems and services. This role requires a blend of … software engineering and operations skills to build and run large-scale, distributed, fault-tolerant systems. Key Responsibilities: System Reliability and Performance: Design, implement, and maintain scalable and reliable infrastructure. Monitor system performance, detect issues, and ensure maximum uptime. Develop and implement strategies for disaster recovery and data backup. Automation and Tooling: Automate repetitive tasks to improve efficiency and reduce … Conduct post-incident reviews to identify root causes and prevent recurrence. Develop and maintain incident response protocols and playbooks. Collaboration and Communication: Work closely with development teams to integrate reliability into the software development lifecycle. Communicate effectively with stakeholders about system status and health. Provide guidance and mentorship to junior team members. Security and Compliance: Ensure systems comply with …
Overview: Site Reliability Engineer - Global Network Services Transformation. A leading financial technology organisation is embarking on an exciting journey to transform its Network Services Group, and they're now seeking a Site Reliability Engineer to join their growing team. This opportunity is perfect for someone who thrives at the intersection of software engineering and … infrastructure reliability. The successful candidate will design, develop, and maintain self-service automation tools that drive efficiency, reduce costs, and improve resilience across one of the world's most sophisticated network infrastructures. Working with colleagues across the US, UK, India, and Singapore, this engineer will play a pivotal role in advancing the company's automation-first approach …
Nottingham, Nottinghamshire, United Kingdom Hybrid / WFH Options
Commify Group
and be part of our success story! Role Summary: In the role of Site Reliability Engineer at Commify, you will be an integral part of our SRE team. Your focus will be on ensuring that our products and platforms perform at their best, understanding how our software interacts with both physical and Cloud infrastructure to deliver exceptional … What essentials are we looking for? Proficiency with Microsoft Azure. Strong expertise in Terraform, App Services, and Kubernetes. Fluent in both written and spoken English. A genuine passion for reliability in systems. Experience in creating and modifying Terraform deployments. Prior experience in an operations role, ideally as a Site Reliability Engineer. Ability to work cross-functionally …
Washington, Washington DC, United States Hybrid / WFH Options
OMW Consulting
Job Title: Site Reliability Engineer (SRE) Location: Washington, DC - Hybrid Clearance: TS/SCI Salary: $160k-$200k Join a dynamic team dedicated to delivering best-in-class service quality and issue resolution for mission-critical deployments. In this role, you will be instrumental in shaping operational policies and implementations while working in both on-premise DoD environments … various OSI model layers to meet SLAs. Collaborate with developers to maintain secure and efficient workflows. What We're Looking For: Minimum of 4 years of experience as an SRE engineer, with a strong focus on automation and deployment. Active security clearance with experience in DoD IT environments. Proficiency in VMware, Kubernetes, Docker, Helm, Ansible, and Terraform. Strong understanding …
Reigate, Surrey, England, United Kingdom Hybrid / WFH Options
esure Group
Engineer to join our Tech Enable team. As a Lead Engineer for Site Reliability, you must demonstrate various skills to effectively lead and engage in SRE practices. The successful candidate will act as a point of escalation for critical issues, applying technical expertise to promptly address complex problems in collaboration with additional teams. What you’ll … do: Serve as the SRE Lead's backup, assuming leadership duties when necessary to maintain the continuity and efficiency of SRE operations. Provide day-to-day guidance, support, and informed decision-making for the team, maintaining stability and direction. Serve as a subject matter expert, shaping technical direction, leading initiatives, and mentoring colleagues to build team capability. Stay up to … date with emerging technologies and industry trends, sharing knowledge across company communities to embed SRE best practice. Drive continual improvement by automating manual processes and optimising monitoring systems to achieve full estate coverage. Lead initiatives to improve availability, performance, and scalability through proactive monitoring, capacity planning, and ongoing maintenance. Collaborate with development squads to embed monitoring, reliability, and scalability …
We're Hiring: Mid-Level Site Reliability Engineer (SRE). This is a fully remote, permanent position. Are you passionate about automation, observability, and scaling systems to support millions of users? Join our client's SRE team within the Platform Engineering organization and help us build resilient, secure, and high-performing infrastructure. What You'll Do: Diagnose and resolve complex infrastructure and application … AWS (EC2, ECS, Fargate, Route53, ALB/NLB) Observability: New Relic, Splunk, DataDog IaC: CloudFormation, Terraform, Helm, Ansible, CDK Containers, Kubernetes, Microservices Technical Skills: 7+ years in DevOps/SRE roles. 3+ years with object-oriented programming (Java/.NET/C++). Expert-level Linux admin and scripting. Strong AWS and infrastructure automation experience. Passionate about security, automation, and scalable …
Job Title: Platform Engineer/SRE Work Location: Bromley/Chester, UK (Hybrid, 3 days a week) Job Description: We are seeking a Platform Engineer/SRE with a strong and diverse technical background. The ideal candidate will possess hands-on development experience along with Site Reliability Engineering (SRE) expertise. This role requires a proactive … platform stability issues, and develop resilient and reliable systems. Key Responsibilities: Provide hands-on technical leadership in platform engineering initiatives. Ensure platform stability and resilience by identifying and resolving reliability … issues. Collaborate with cross-functional teams to deliver scalable and robust system solutions. Key Skills Required: Strong development experience in Java (primary skill). Site Reliability Engineering (SRE) experience. Proficiency with Kafka, Mule, and Oracle Database. Ability to work at a managerial level while remaining hands-on with technical tasks. Nice to Have: Knowledge of payment systems …
Chester, Cheshire West and Chester, Cheshire, United Kingdom Hybrid / WFH Options
Ascendion
Job Title: Platform Engineer/SRE Work Location: Bromley/Chester, UK (Hybrid, 3 days a week) Job Description: We are seeking a Platform Engineer/SRE with a strong and diverse technical background. The ideal candidate will possess hands-on development experience along with Site Reliability Engineering (SRE) expertise. This role requires a proactive … platform stability issues, and develop resilient and reliable systems. Key Responsibilities: Provide hands-on technical leadership in platform engineering initiatives. Ensure platform stability and resilience by identifying and resolving reliability … issues. Collaborate with cross-functional teams to deliver scalable and robust system solutions. Key Skills Required: Strong development experience in Java (primary skill). Site Reliability Engineering (SRE) experience. Proficiency with Kafka, Mule, and Oracle Database. Ability to work at a managerial level while remaining hands-on with technical tasks. Nice to Have: Knowledge of payment systems …
globe. What you'll do: As a Site Reliability Engineer at Zefr, you'll apply your expertise in cloud infrastructure, CI/CD, observability, and core SRE concepts to deliver high-quality, reliable, and scalable solutions. A significant aspect of this role involves working closely with Zefr's Engineering and Data Science teams ensuring the infrastructure required …
Honolulu, Hawaii, United States Hybrid / WFH Options
OMW Consulting
Role - Site Reliability Engineer Location - Honolulu - Hybrid - 1-2 days a week on site Security … clearance - Minimum Secret - need this ahead of applying Salary - $150k-$200k + Equity I am partnered with a leading defense tech scale-up that is looking to add an SRE to its team based in Hawaii. This role is hybrid with an expectation of 1-2 days on site in Honolulu; however, there are some weeks where you will … not need to go on site at all. Due to the nature of the client, you must hold an active Secret clearance as a minimum ahead of applying for this position. To be considered for this position you must have experience with the following: Experience with Security Clearance and DoD IT Environment: You hold an active security clearance, are …
Reigate, Surrey, England, United Kingdom Hybrid / WFH Options
esure Group
driven insights alongside exceptional service, to deliver personalised experiences that meet our customers' ever-changing needs today and in the future. Job Description: We are currently recruiting for a Site Reliability Engineer to join our Tech Enablement function. The successful candidate will be responsible for our monitoring estate, and for the continuous improvements … and maintenance of it, and will assist in incident investigation and resolution when required. They will also share skills within our Tech Enablement team and should be an evangelist for SRE techniques and goals to the broader IT community. What you’ll do: Deliver proactive and reactive activities to meet SLAs and availability. Partner with development squads pre-launch to embed …
want it to go. *** Applicants must be sole UK nationals and already hold HMG HLC clearance *** Role Location: Gloucester or Manchester. We are seeking highly skilled and motivated Site Reliability Engineers to join our team. The ideal candidates will possess a good understanding of engineering principles and a broad understanding of full-stack software technologies, with hands-on … and cost optimisation (rightsizing, reserved instances, auto scaling). • Disaster Recovery & Business Continuity Planning: Develop and test backup/DR strategies, restore drills, and self-healing infrastructure to ensure reliability and uptime. • Collaboration & Knowledge Sharing: Work closely with DevOps, development, security and operations teams; prepare architecture/design documents, network diagrams, runbooks and training materials. Required qualifications to be … encryption, audit logging, network isolation, and compliance frameworks. • Monitoring & Optimization Tools: Familiarity with CloudWatch, Grafana, Datadog, Prometheus, ELK or similar. The position requires team members to work from the client site to ensure the reliability and availability of critical systems. Together, as owners, let’s turn meaningful insights into action. Life at CGI is rooted in ownership, teamwork, respect …
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
Fruition Group
Software Engineer/SRE | JavaScript/TypeScript, Node.js, AWS, Observability | Leeds/Hybrid, c. 2x per week | Salary up to £65,000. We're looking for a Software Engineer with strong AWS and Observability experience to join a growing engineering team in Leeds. This is a hybrid role, giving you the flexibility to split your time between home … and a modern city-centre office. You'll work across both engineering and site reliability, helping to build and scale systems that are reliable, secure, and observable. You'll be a key part of improving platform performance and automation, while collaborating with developers, product teams, and operations. What you'll be doing: Building and maintaining scalable cloud infrastructure … in AWS. Implementing and improving observability tools (monitoring, logging, tracing). Automating deployments and improving CI/CD pipelines. Driving reliability, availability and performance across systems. Working with developers and SREs to solve complex problems. What we're looking for: Strong experience with AWS (EC2, ECS, Lambda, RDS etc.). Good knowledge of observability tools (Grafana, Prometheus, OpenTelemetry, Datadog, or similar …
Below are the details of the position: Job Title: Platform Engineer/SRE Work Location: Bromley, UK (Hybrid – 3 days a week) Job Description: 15+ years’ experience in delivering large scale applications with focus on performance, scalability, security, and reliability. Experience in a highly Agile continuous integration and continuous deployment environment, preferably within a financial domain. Strong experience in …