Site Reliability Engineer
Mid-Level Site Reliability Engineer (SRE)
Are you an experienced Site Reliability Engineer with a passion for building reliable, scalable systems that empower innovation? Our client is looking for a skilled Mid-Level SRE to join a growing technology team.
In this role, you’ll help ensure our infrastructure is stable, secure, and efficient, supporting the applications that support our clients.
The Role
We are seeking a mid-level Site Reliability Engineer (SRE) to join our technology team, helping to ensure the smooth operation and reliability of our infrastructure. You’ll play a vital role in maintaining uptime, managing deployments, and supporting other team members.
This is a hands-on position suited for someone who thrives on problem-solving, process improvement, and cross-team communication.
What You’ll Do:
Maintain & Improve Systems
- Ensure the reliability, performance, and availability of production systems.
- Perform regular updates, patching, and maintenance across environments.
- Manage infrastructure provisioning using Terraform, Ansible, and AWS.
Collaborate & Support
- Work closely with the junior SRE to develop their practical experience and technical confidence.
- Partner with developers, data scientists, and business users to resolve technical issues.
Automate & Optimise
- Contribute to configuration management and automation improvements.
- Identify and document standard operating procedures.
- Implement proactive monitoring measures to detect and prevent issues.
Monitor & Troubleshoot
- Troubleshoot system issues using logs, monitoring tools, and a methodical approach.
- Oversee and enhance system monitoring with Nagios, and support the transition to Datadog.
Incident Management
- Support incident management processes, including post-mortems and follow-up actions.
- Communicate outcomes to customers clearly and effectively.
What We’re Looking For:
Experience
- Proven experience in an SRE, DevOps, or Operations Engineering role.
- Strong working knowledge of AWS, Terraform, and Ansible.
Technical Skills
- Linux system administration & shell scripting.
- Networking fundamentals, containerization, and infrastructure security best practices.
- Version control experience (e.g., Git).
- Strong troubleshooting and root cause analysis skills.
Desirable Skills
- Experience with Kubernetes and/or other cloud platforms.
- Familiarity with Nagios, Datadog, or similar monitoring tools.
- Exposure to CI/CD systems such as TeamCity, AWS CodeBuild, AWS CodePipeline, or ArgoCD.
Personal Attributes
- Proactive, curious, and process-driven.
- Enjoys collaboration and mentoring.
- Calm under pressure, especially during incidents.
- Flexible and adaptable to technical and business priorities.
Nice-to-Have
- Experience supporting scientific or data-intensive applications.
- Background in post-mortem facilitation and follow-up.
- Enthusiasm for observability, performance tuning, and cost optimisation.
- Company: BOSS Professional Services LTD
- Location: Macclesfield, Cheshire, England, United Kingdom
- Employment Type: Full-Time
- Salary: £50,000 - £60,000 per annum