traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core, BCG X, and Consulting Team (CT) worldwide. The leader will drive strategic planning, execution, and … native services across AWS, Azure, and GCP. * Scale Infrastructure as Code (IaC), automated provisioning, and cloud workload optimization. * Drive edge computing, containerized workloads, and high-performance computing strategies. * Implement AI-driven monitoring, self-healing automation, and full-stack observability. IT Service Management & Operational Excellence: * Mandate and assure the adoption … SRE-based operational metrics, including SLOs, SLIs, and error budgets. * Oversee incident response, problem resolution, and root cause analysis with AI-driven remediation. * Ensure high availability, performance, and security compliance for all enterprise services. * Develop a follow-the-sun operational support model, ensuring 24x7 resilience and uptime across
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Fruition Group
shaping the infrastructure and operational strategy for one of the most innovative businesses in their market. Working with cutting-edge technology, this role offers high-impact challenges, meaningful collaboration, and excellent career progression. Senior SRE Responsibilities: Manage and optimise cloud infrastructure to ensure scalability, high availability, and … as PowerShell or Python. Champion infrastructure best practices and mentor junior team members. Senior SRE Requirements: Extensive experience in SRE or DevOps roles within high-availability, cloud-native environments. Strong expertise with AWS (including EKS, MSK, RDS, VPC design, encryption, and IAM). Experience with Kubernetes and Argo
City of London, London, United Kingdom Hybrid / WFH Options
Motability Operations
privileged access planning. This is a hands-on technical leadership role, accountable for guiding a multidisciplinary team of engineers and contractors, ensuring consistent delivery, high service availability, and secure integration of identity services into core business platforms. The role also plays a key part in the transition from … improvement. You have a proven ability to support or lead identity platform delivery across a broad estate, and you understand the importance of maintaining high availability in customer-facing authentication services. You are confident guiding others in the design and delivery of joiner-mover-leaver (JML) automation, access … reviews, SoD frameworks, and privileged access strategy. You're comfortable setting direction, influencing architecture, and ensuring that the team consistently delivers to a high standard. You work effectively with cross-functional teams, from architects to compliance to app owners, and are able to represent IAM priorities in technical forums
Employment Type: Permanent, Part Time, Work From Home
in cloud-based environments: Develop and implement robust IT architecture strategies for cloud and hybrid environments, leveraging AWS best practices. Design scalable, secure, and high-availability solutions tailored to business needs. Architect and optimize data platforms to enable efficient data collection, storage, and processing. Implement and manage cloud
to you. This is a unique opportunity to shape, direct and build better financial services for all UK businesses, whilst being part of a high-growth tech company. The company is backed by leading global VCs and brings together seasoned payment, banking and tech industry professionals who are aiming to redefine … Cloud Native and Serverless Architecture You will collaborate with other stakeholders and manage suppliers to fulfil business requirements through system enhancements and maintenance. Ensuring high availability in production systems is a critical responsibility of this role. To excel in this role, you should be: A people-oriented leader … who is easy to work with, adaptable and flexible while maintaining high standards. Supported by global teams of engineers, QAs, SREs, and DevOps professionals. Comfortable managing a mix of direct reports and shared teams. Skills & Experiences: AWS Technologies (e.g. ECS, DynamoDB, Lambda, Aurora, SQS, SNS, VPC, Private Link, etc.
Lead ClickHouse implementation at a top-tier global investment bank. Own a high-impact role with long-term growth and visibility. About Our Client: My client is a prestigious global financial services firm with a strong reputation for innovation in banking and investment solutions. They are committed to sustainability … of services to clients across various industries, with a focus on cutting-edge technology and data-driven decision-making. Job Description: Design and implement high-performance ClickHouse environments. Drive the migration of analytical databases to ClickHouse with minimal disruption. Manage database query optimisation for scalability and performance. Operate production … ready clusters with high availability and fault tolerance. Develop and maintain backup, disaster recovery, and data security strategies. Collaborate with teams to align database solutions with business goals. Monitor performance with Prometheus and Grafana, addressing issues proactively. Implement best practices for data modelling and processing. Guide and mentor
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Realtime Recruitment
/Azure)/on-premise ecosystem and seek a Network Engineer for design, implementation, and stewardship of our hosting environments. This role emphasizes observability, high availability, security, and infrastructure-as-code, involving collaboration with TechOps, DevOps, and InfoSec, and support of existing/acquired assets. Responsibilities: Implement technical
with databases (MySQL, MongoDB). Ability to quickly learn new technologies. Good understanding of security practices. Nice to have: Experience with blockchain integration. Comfortable with high-availability concepts. Ruby, Rust or C++ skills are a plus. Other technologies of interest: Message queues (Redis), Caches and Job Queues