26 to 28 of 28 Distributed Computing Jobs in the UK

Senior DevOps Engineer

Hiring Organisation
Humanoid
Location
City of London, London, United Kingdom
operating multi-GPU, cross-cloud platforms that enable efficient, reliable, and scalable model training. You’ll work at the intersection of DevOps, MLOps, and distributed systems, helping push the limits of real-world AI. What You’ll Do: Design, build, and operate scalable multi-GPU infrastructure across cloud environments … code and automation for provisioning, orchestration, and lifecycle management. Build and evolve CI/CD pipelines for both infrastructure and ML training workflows. Optimize distributed training workloads (scheduling, resource utilization, observability). Ensure high standards of reliability, scalability, security, and monitoring across systems. Collaborate with ML engineers and researchers ...

Staff DevOps Engineer

Hiring Organisation
Humanoid
Location
City of London, London, United Kingdom
multi-GPU, cross-cloud platforms, driving architecture, reliability, and performance at scale. This role sits at the intersection of DevOps, MLOps, and distributed systems, enabling cutting-edge AI in real-world environments. What You’ll Do: Lead the design and evolution of scalable multi-GPU infrastructure across cloud environments … code and automation for provisioning, orchestration, and lifecycle management. Architect and improve CI/CD systems for both infrastructure and ML training workflows. Optimize distributed training workloads (scheduling, resource utilization, observability). Partner with ML engineers and researchers to enable efficient experimentation and productionization. Lead troubleshooting and resolution of complex ...

Platform Engineer: £120k + Bonus/benefits (AI Trading)

Hiring Organisation
Hunter Bond
Location
London Area, United Kingdom
operating systems to automation and observability, while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities: Manage a distributed compute environment and several petabyte-scale storage systems. Install, configure, and monitor RHEL-based Linux environments. Troubleshoot hardware and software issues across the stack … Experience with modern software development practices (version control, agile methodologies). Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible). Exposure to distributed storage systems and related protocols. Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana). Strong written and verbal communication skills. Demonstrated ...