Site Reliability Engineering (SRE) Manager
- Hiring Organisation: Halian Technology Limited
- Location: United Kingdom
- Employment Type: Permanent, Work From Home
- … major incidents, mitigation, RCA, and preventative improvements
- Own and refine SLIs, SLOs, and error budgets
- Reduce operational toil through automation
- Deep-dive Linux debugging, performance tuning, and systems analysis
- Strengthen observability, monitoring, and alerting
- Provide technical leadership to a small SRE/engineering group
- Improve and manage … development teams to build reliability into system design

What You'll Bring
- Strong AWS experience (EC2, networking, autoscaling, IAM, load balancing)
- Deep Linux troubleshooting skills (performance, networking, debugging)
- Real 24/7 production on-call experience
- Hands-on incident management and postmortems
- Experience mentoring or leading a small technical team
...