Site Reliability Engineer
- Hiring Organisation
- SS&C Technologies
- Location
- Shrewsbury, Shropshire, UK
- Employment Type
- Full-time
week in the office. You'll collaborate closely with engineering, product, and support to design, build, and run robust platforms that meet demanding SLAs/SLOs.

What You'll Do
- Keep production healthy: Monitor, troubleshoot, and resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high … quality post-incident actions.
- Observability as a first-class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root-cause issues, and create actionable alerts and dashboards.
- Run Kubernetes at scale: Operate and harden Kubernetes (EKS preferred); manage deployments, autoscaling, rollouts / ...