Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
at scale. In this role, you'll partner with the team to develop new functionality and modernise existing infrastructure, design pipelines, improve monitoring, logging and alerting, and ensure our systems remain performant, secure, and efficient. Responsibilities: This is a hands-on role with high ownership. You'll collaborate across teams to: Modernise our infrastructure by leading the migration from … speed - at once. You'll be key to making it work. Required Skills: Knowledge of one or more programming languages (Java/Scala, TypeScript, Python). Validated experience operating distributed systems at scale in production. Cloud: AWS (primary), Kubernetes (future), Docker (current), Terraform. Excellent debugging skills across network, systems, and data stack. Observability tooling, e.g. custom metrics … optimising Kafka, Spark, or container networking under load. In Return: This role is at the heart of a highly leveraged platform, enabling hundreds of engineers to use critical data systems with confidence. You'll have ownership, impact, and a seat at the table as we define how SRE and platform thinking shape our next-generation data infrastructure. If you …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
should possess a robust background in deploying and running Kubernetes clusters across multi-cloud and hybrid-cloud environments at scale. Responsibilities: Deploy and manage enterprise-scale Kubernetes clusters hosting distributed application services in a multi-tenant environment. Work proficiently with multi-cloud or hybrid-cloud setups, including data centre virtualisation technologies like VMware, Harvester, OpenStack, or others. Contribute to … Certified Kubernetes Application Developer (CKAD) certifications. Experience working with automated deployment and management of EKS clusters. Specialist or Architect level certifications in AWS, GCP, and Azure. Experience in deploying and supporting distributed systems on Kubernetes. In Return: Based in Cambridge UK, this is an opportunity to join a dynamic, collaborative, and driven team, and provides a genuine opportunity to craft …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Microsoft Corporation
engineering, and natural sciences, which again emphasises the importance of collaboration and teamwork. We are seeking a highly motivated and experienced Sr RSDE with expertise in machine learning and distributed systems. The ideal candidate will have a deep understanding of machine learning and be proficient in the design, planning, and implementation of tools and technology to support AI-driven … source ecosystem. Proficient experience working with machine learning and large datasets. In-depth understanding of open source machine learning frameworks (e.g., PyTorch, ggml, llama.cpp, vllm). Experience building complex systems on the cloud. Experience building and optimizing distributed systems and large-data applications, including those using tensor accelerators or GPUs. Strong analytical, problem-solving, and communication skills. … required, but preferred. for Science Responsibilities Architect, design, and implement scalable and robust solutions for machine learning and scientific research involving large volumes of heterogeneous data. Build and optimize distributed data processing and model building pipelines. Develop and maintain tools and technologies for building, training, optimizing, and scaling machine learning solutions. Collaborate with cross-functional teams, including scientists, researchers, and …