Site Reliability Engineer

Location: Gloucester (Hybrid & Flexible working options available)

About the Role

Our National Security business in Gloucester is expanding, offering increased opportunities to work with high-profile clients on solutions with significant real-world impact. You will join a growing team that prioritizes both client delivery and community engagement, including local outreach to build tech and cyber skills in the region.

As a Site Reliability Engineer (SRE), you will bridge the gap between software engineering and systems operations. You will use software expertise to automate tasks and reduce manual labor, ensuring that traditional operations work (incident tickets, on-call, etc.) occupies no more than 50% of your time.

Core Accountabilities

  • Service Maintenance: Support and maintain essential services for core mission applications, proactively enhancing availability, performance, and stability.
  • Automation & Innovation: Replace repetitive manual work with innovative, automated solutions.
  • Consultancy: Work alongside product teams to advise on design and build best practices, ensuring systems are scalable and resilient.
  • Monitoring & Instrumentation: Implement application monitoring to demonstrate daily improvements and ensure deep visibility into system health.
  • Community Engagement: Actively participate in the internal DevOps and SRE communities to share knowledge and drive standards.

Technical Background & Experience

We are looking for candidates who typically possess experience in the following areas:

  • Software Development: Proficiency in Java and web technologies (JavaScript, HTML).
  • Databases: Familiarity with technologies such as Elasticsearch and MongoDB.
  • Operating Systems: Strong command line skills in Linux (Bash) and Windows (PowerShell).
  • Cloud Infrastructure: Hands-on experience with AWS, Azure, or OpenStack.
  • Deployment & Configuration: Use of tools like Chef and Puppet.
  • Monitoring: Expertise in monitoring large-scale systems using the ELK stack.
  • Agile Methodology: Experience working in an Agile Scrum team and using supporting tools like Jira.
  • Troubleshooting: Proven ability to diagnose and resolve application issues and service outages across various levels of the stack.
  • Modern Architecture: Experience with container management (Docker) and microservices.
  • Open Source: Experience extending and improving Open Source Software (OSS).
  • Testing: Familiarity with automation test frameworks such as Selenium.

Culture and Benefits

  • Hybrid Working: We embrace flexible arrangements, allowing for a balance of office-based, client-site, and remote work to enhance well-being.
  • Inclusion: We welcome candidates from all backgrounds and are committed to making our recruitment process accessible. Reasonable adjustments are available for candidates with disabilities or health conditions.
  • Security Clearance: Please note that these roles are subject to security and export control restrictions. Applicants must, at minimum, achieve Baseline Personnel Security Standard (BPSS). Many roles require higher levels of National Security Vetting, for which applicants typically need 5 to 10 years of continuous UK residency.

Job Details

Company: Anson McCade
Location: Gloucester, England, United Kingdom (Hybrid / Remote Options)