Lead Site Reliability Engineer (Hiring Immediately)
This role plays a key part in the global follow-the-sun support model, working closely with the Global SRE Leader to support platforms worldwide. We are looking for SRE talent with experience in an On-Prem / Datacenter environment.
The ideal candidate will bring strong technical leadership, experience in an On-Prem / Datacenter environment, and a passion for operational excellence to a high-impact team. You'll collaborate with Engineering, Infrastructure, and Operations teams to maintain high availability and resilient service delivery, while also mentoring an SRE team focused on continuous improvement and innovation.
Key Responsibilities:
Technical Leadership
- Develop deep expertise in the Titanium trading platform to lead and support critical business operations.
- Oversee team workload, ensuring priorities align with business goals and resource capacity.
Operational Excellence
- Champion initiatives that enhance system availability, scalability, and performance.
- Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery).
Cross-Functional Collaboration
- Partner with Software Engineering, Infrastructure, Operations, Security, and Business teams to deliver secure and reliable platforms.
Team Development
- Build, lead, and mentor a high-performing SRE team in Europe, fostering a culture of ownership, collaboration, and innovation.
- Lead response efforts for critical incidents, ensuring swift resolution and comprehensive root cause analysis.
- Drive long-term improvements based on lessons learned from Learning Reviews, and maintain accurate incident documentation and compliance reporting.
- Lead automation initiatives to streamline workflows and increase uptime.
- Use Jira to manage tasks and projects, and align global SRE practices for seamless support.
Capacity Planning
- Drive timely capacity planning to prevent last-minute issues.
- Support budget planning to align infrastructure investments with growth and performance targets.
- Participate in quarterly capacity reviews and follow up on outcomes.
Monitoring & Analytics
- Oversee the implementation of monitoring and alerting systems that detect and resolve issues proactively, before they affect customers or compliance.
Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or related field (Master’s preferred)
- 5+ years in a technical SRE or DevOps position
- 2+ years in a leadership or senior engineering capacity
Preferred Skills:
- Proficiency in SQL and data analytics tools (e.g., Sigma, Snowflake)
- Experience in AWS, monitoring tools (Datadog, Prometheus, Grafana), and automation frameworks (Terraform, Ansible, Pulumi)
For more information, please apply with a relevant CV.
- Company
- JR United Kingdom
- Location
- London, UK
- Employment Type
- Full-time