Lead Site Reliability Engineer
Sunderland, United Kingdom
Tombola
collaborate closely with our development, infrastructure, and security teams, balancing exciting new feature delivery with rock-solid system stability.

Key Accountabilities and Responsibilities:

Team Leadership and Development
Providing leadership, management, and development for direct reports through effective 1-to-1s, objective setting (OKRs), and performance management.
Making team goals clear and ensuring they align with our broader business … objectives.
Collaborating with other teams and departments to achieve shared success.
Partnering with our People Partner for tech to build robust team management practices.

System Reliability and Availability
Ensure system uptime: Monitor and maintain the availability and reliability of critical systems and services, meeting all uptime SLAs (Service Level Agreements).
Incident management: Quickly respond to incidents, investigate … root causes, and ensure effective postmortems and continuous improvement processes are in place.
Failure detection and response: Proactively identify potential failures or performance bottlenecks before they impact users, and respond to failures and outages effectively.

Monitoring and Alerting
Implement monitoring systems: Set up and maintain robust monitoring systems (e.g., Dynatrace) for application performance, infrastructure health, and system metrics.
Employment Type: Permanent
Salary: GBP Annual
Posted: