Site Reliability Engineer
Southampton, Hampshire, United Kingdom
Hybrid / WFH Options
NICE
Run the production environment by monitoring availability and taking a holistic view of system health. Build software and systems to manage platform infrastructure and applications. Improve the reliability, quality, and time-to-market of our suite of software solutions. Measure and optimize system performance, with an eye toward pushing our … capabilities forward, getting ahead of customer needs, and innovating to continually improve. Provide primary operational support and engineering for multiple large distributed software applications.
How will you make an impact?
Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault … with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch). Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems. Experience with incident management and blameless postmortems, including driving incident response, resolution, and communication during outages and other critical incidents.
Employment Type: Permanent
Salary: GBP Annual
Posted: