3 of 3 Site Reliability Engineering Jobs in Cambridgeshire

HPC Engineer - Contract via Umbrella - Cambridge/Hybrid

Hiring Organisation
Robson Bale Ltd
Location
Cambridge, Cambridgeshire, United Kingdom
Employment Type
Contract
Contract Rate
GBP Annual
will work closely with the scientific community to deliver high-quality HPC services, leveraging automation, infrastructure-as-code and DevOps practices to ensure scalability, reliability and performance in a rapidly evolving HPC landscape. Responsibilities: Design, implement and maintain robust platform infrastructure using Infrastructure as Code (IaC) tools such … Terraform. Develop, deliver and operate research computing services and applications. Take a Site Reliability Engineering approach to HPC services, managing development, deployment, monitoring and incident response end-to-end. Solve complex technical problems related to HPC services and user workflows. Drive innovative computational solutions and exploit emerging ...

Principal Developer Team Lead

Hiring Organisation
Cambridge University Press & Assessment
Location
Cambridge, Cambridgeshire, United Kingdom
Employment Type
Permanent
Salary
GBP 51,400 - 68,800 Annual
developers while establishing the foundations for our future technology stack. Your initial focus will be on two strategic priorities: Evolving our SRE function - Building the DevOps infrastructure, automation, and tooling that enables Site Reliability Engineering practices across development and operations teams. Advancing our AI development practice - Establishing … education platforms. What You'll Do: Technical Leadership: Lead migration of legacy applications to cloud-native AWS architectures. Build DevOps automation to support SRE practices. Establish AI/ML development standards and frameworks. Set observability, monitoring, and incident response standards. Promote best practices in web, event-driven, and cloud-native ...

Director, Private Cloud Platforms

Hiring Organisation
Jobleads-UK
Location
Cambridge, England, United Kingdom
will lead the strategy, design, and operation of a large-scale Arm-based private cloud environment supporting compute workloads across Arm, including both our Engineering and Enterprise users. Starting with 1,500 servers, and then rolling this out to all existing HPC environments representing a multi-region deployment … Leadership Team level in other organisations. Working with globally distributed teams. Familiarity with EDA engineering environments. Experience with Platform Engineering and SRE. Experience integrating schedulers such as IBM LSF or Slurm! In Return: We offer a competitive total reward package, including base salary and equity, ensuring you share ...