Senior Director - Operations and Reliability Engineering (Hiring Immediately)

Job Description

Locations: Canary Wharf | Boston

Who We Are

Boston Consulting Group partners with leaders in business and society to tackle their most important challenges and capture their greatest opportunities. Founded in 1963, BCG pioneered business strategy and now helps clients with total transformation—driving complex change, enabling growth, building competitive advantage, and delivering bottom-line impact. Success requires blending digital and human capabilities.

Our diverse, global teams bring deep industry and functional expertise and a range of perspectives to spark change. BCG delivers solutions through management consulting, technology and design, corporate and digital ventures, and business purpose. We work collaboratively across all levels of the client organization to generate results that enable clients to thrive.

What You'll Do

The Senior Director – Operations and Reliability Engineering is responsible for integrating Site Reliability Engineering (SRE), DevOps, and traditional operations to develop a next-generation Reliability Engineering function.

This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all BCG entities worldwide. The leader will drive strategic planning, execution, and optimization of global IT infrastructure, cloud operations, and service management, while ensuring a secure, scalable, and efficient technology environment. The role also involves embedding IT Service Management (ITSM) processes across all teams and ensuring compliance with them, in line with standardized frameworks.

Key Responsibilities

Strategic Leadership & Transformation: Define and execute a modern Reliability Engineering strategy that integrates SRE, DevOps, and automation; drive automation to eliminate toil and improve efficiency; lead the transition to AI-driven, self-healing infrastructure; establish observability and analytics frameworks; align strategies with business goals.
Infrastructure & Cloud Operations: Oversee IT infrastructure, cloud platforms, and hybrid environments; manage network reliability, compute, and cloud services across AWS, Azure, and GCP; scale Infrastructure as Code (IaC), automation, and workload optimization; implement AI-driven monitoring and self-healing automation.
IT Service Management & Operational Excellence: Mandate adoption of ITSM processes; establish operational metrics, including SLOs, SLIs, and error budgets; oversee AI-assisted incident response and root cause analysis; ensure high availability, performance, and security compliance; develop a 24x7 operational support model; optimize incident, change, and capacity management; lead Service Asset and Configuration Management (SACM).
Security, Compliance & Risk Management: Embed security and compliance into workflows; ensure adherence to ISO 27001, NIST, SOC 2, GDPR, and cloud security standards; collaborate on zero-trust security models; drive resiliency, disaster recovery, and business continuity initiatives.
Financial & Vendor Management: Optimize operational budgets with a cloud strategy; negotiate vendor contracts; drive cost efficiency in cloud and infrastructure investments.
Leadership & Talent Development: Build and mentor a high-performing Reliability Engineering team; foster a culture of automation and innovation; promote a collaborative, data-driven, proactive mindset; develop workforce programs for AI-driven operations and modern reliability practices.

What You'll Bring

Required Qualifications: 15+ years in IT operations, SRE, DevOps, or platform engineering; 5+ years in senior leadership managing large-scale IT environments; deep expertise in cloud (AWS, Azure, GCP), on-premises, and hybrid environments; proven experience in automation, IaC, observability, and AI-driven IT operations; strong understanding of security, compliance, and risk management; excellent leadership and stakeholder management skills.

Preferred Qualifications: Certifications such as ITIL, AWS/Azure/GCP Solutions Architect, SRE Foundation, CISSP, or similar; experience with Kubernetes, Terraform, Ansible, and AI operations tools; strong problem-solving skills with a data-driven approach.

Additional Information

This pivotal leadership role involves shaping the future of IT operations by integrating SRE, DevOps, and automation methodologies. If you are a technically skilled, innovation-driven leader passionate about scaling operations through automation and AI-driven resilience, we encourage you to apply.

Work Environment & Additional Details: Hybrid or on-site; occasional travel; fast-paced, high-availability environment focused on automation and reliability.

Boston Consulting Group is an Equal Opportunity Employer. All qualified applicants will receive consideration without regard to various protected characteristics.

Company: ZipRecruiter
Location: London, UK
Employment Type: Full-time
Posted: 13 hours ago