Design and deploy scalable, high-performance cloud infra for ML workloads. Build and manage GPU clusters, storage systems, and distributed training environments. Set up and optimise containerised workflows (Docker, Kubernetes, Terraform). Implement robust monitoring, incident response, and CI/CD practices. Collaborate closely with researchers to integrate and scale experiments. This person must have experience building ML Infrastructure and cloud … architecture from scratch. Key Details: Salary: £100k–£130k (flexible for strong profiles). Working Model: On-site, London. Tech Stack: AWS/GCP/Azure, Kubernetes, Docker, Terraform, Python, MLflow/Prometheus/Grafana. If you want to shape the backbone of one of Europe’s most ambitious AI startups, we’d love to hear from you.
Java 🛠️ Lead on requirements, design workshops, and solutioning 🤝 Mentor junior engineers and share best practices ☁️ Implement & optimise microservices and distributed systems on GCP (GKE, PubSub, BigQuery, Dataflow) using Docker & Kubernetes 🔄 Build and manage efficient data pipelines with streaming tech + relational/NoSQL databases ✅ Ensure high quality through robust unit, integration, and non-functional testing ⚡ Contribute to CI/CD … OOP experience 📊 Background in data platforms, frameworks & streaming technologies 🧩 Familiarity with microservices & distributed systems ☁️ Exposure to GCP (GKE, PubSub, BigQuery) 🗄️ Experience with relational/NoSQL databases 🐳 Proficiency with Docker & Kubernetes 🔄 Solid understanding of engineering best practices (CI/CD, Git, testing) 🧠 Problem-solving mindset & curiosity for new technologies 👉 If this sounds like your kind of challenge, I’d love to …
London (City of London), South East England, United Kingdom
Arrows
Java 🛠️ Lead on requirements, design workshops, and solutioning 🤝 Mentor junior engineers and share best practices ☁️ Implement & optimise microservices and distributed systems on GCP (GKE, PubSub, BigQuery, Dataflow) using Docker & Kubernetes 🔄 Build and manage efficient data pipelines with streaming tech + relational/NoSQL databases ✅ Ensure high quality through robust unit, integration, and non-functional testing ⚡ Contribute to CI/CD … OOP experience 📊 Background in data platforms, frameworks & streaming technologies 🧩 Familiarity with microservices & distributed systems ☁️ Exposure to GCP (GKE, PubSub, BigQuery) 🗄️ Experience with relational/NoSQL databases 🐳 Proficiency with Docker & Kubernetes 🔄 Solid understanding of engineering best practices (CI/CD, Git, testing) 🧠 Problem-solving mindset & curiosity for new technologies 👉 If this sounds like your kind of challenge, I’d love to …
Maidstone, Kent, United Kingdom Hybrid / WFH Options
Gold Group
with an in-depth knowledge of the design and development of software solutions using .NET Framework, .NET Core, and .NET 6+, CI/CD pipelines, Azure DevOps, Terraform, Docker and Kubernetes (AKS). The successful candidate will be responsible for building scalable and maintainable software systems, while also contributing to CI/CD pipeline design, infrastructure automation, and testing integration. * Salary … to have the following: * Experience in software development with a strong understanding of DevOps engineering. * Proficiency in CI/CD pipeline design and implementation. * Hands-on experience with Docker, Kubernetes, and Azure DevOps. * Strong knowledge of .NET development and Visual Studio environments. * Experience with infrastructure as code using Terraform. * Familiarity with automated testing frameworks and their integration into the pipeline. This really …
the maintenance of existing legacy systems and support the transition to a cloud-native architecture. This role requires expertise in CI/CD automation, container orchestration using Docker and Kubernetes, Linux system administration, and infrastructure-as-code in cloud-agnostic environments. The position will require an additional security scrub prior to onboarding. Tasks Performed: • Automate build, test, and deployment processes … alerting tools. Education, Skills and Qualifications: • Demonstrated 5+ years of DevOps experience. • Demonstrated 3+ years of Continuous Integration/Continuous Development (CI/CD) experience. • Demonstrated 3+ years of Kubernetes and Docker experience. • Demonstrated experience in supporting cloud agnostic environments. • Demonstrated experience in integrating automatic test tools into the CI/CD pipeline. • Demonstrated experience with Linux Operating Systems. • Experience …
London, England, United Kingdom Hybrid / WFH Options
Harnham
involve: Building and maintaining scalable cloud infrastructure (AWS, GCP, Azure) for ML workloads and APIs. Setting up ML nodes for distributed training and local development. Managing containerised environments (Docker, Kubernetes, Terraform). Optimising storage for big data pipelines supporting ML workloads. Monitoring systems and responding to incidents, ensuring reliability and performance. Working closely with ML engineers and researchers to integrate infra … in cloud engineering, ideally with ML-related workloads. Proficiency in scripting (Bash, PowerShell, Python). Start-up/Scale-up experience. Strong cloud skills (AWS, GCP, Azure) and containerisation (Docker, Kubernetes). Experience in automating deployments and orchestrating cloud environments. Nice to have: Python (Jupyter, PyTorch), monitoring tools (Prometheus, Grafana), cloud databases (RDS, Aurora, Spanner), CI/CD tools (CircleCI), and data …
Senior Software Engineer. Tech Stack: C#, .NET, Microservices, Docker, Kubernetes, AWS/Azure. United Kingdom. Full Remote: Perm Role. Salary: £85,000 - £110,000. Harrington Starr has partnered with an innovative start-up FinTech that specialises in providing market access and data from its ultra-low-latency platform to financial service firms, leading exchanges and single dealers across digital assets … products. Key Requirements: Extensive C#, .NET development experience (5+ years). Strong experience developing highly scalable, low-latency applications. Good hands-on development experience with cloud-based microservice architecture (Docker, Kubernetes). Extensive experience with TDD and Event Driven Development. Independent self-starter, confident taking ownership of developing greenfield applications. Strong collaborator and communicator. 2:1 in BSc Computer Science or …
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
involve: Building and maintaining scalable cloud infrastructure (AWS, GCP, Azure) for ML workloads and APIs. Setting up ML nodes for distributed training and local development. Managing containerised environments (Docker, Kubernetes, Terraform). Optimising storage for big data pipelines supporting ML workloads. Monitoring systems and responding to incidents, ensuring reliability and performance. Working closely with ML engineers and researchers to integrate infra … in cloud engineering, ideally with ML-related workloads. Proficiency in scripting (Bash, PowerShell, Python). Start-up/Scale-up experience. Strong cloud skills (AWS, GCP, Azure) and containerisation (Docker, Kubernetes). Experience in automating deployments and orchestrating cloud environments. Nice to have: Python (Jupyter, PyTorch), monitoring tools (Prometheus, Grafana), cloud databases (RDS, Aurora, Spanner), CI/CD tools (CircleCI), and data …
Employment Type: Full-Time
Salary: £140,000 - £160,000 per annum, Inc benefits
Slough, South East England, United Kingdom Hybrid / WFH Options
Harnham
involve: Building and maintaining scalable cloud infrastructure (AWS, GCP, Azure) for ML workloads and APIs. Setting up ML nodes for distributed training and local development. Managing containerised environments (Docker, Kubernetes, Terraform). Optimising storage for big data pipelines supporting ML workloads. Monitoring systems and responding to incidents, ensuring reliability and performance. Working closely with ML engineers and researchers to integrate infra … in cloud engineering, ideally with ML-related workloads. Proficiency in scripting (Bash, PowerShell, Python). Start-up/Scale-up experience. Strong cloud skills (AWS, GCP, Azure) and containerisation (Docker, Kubernetes). Experience in automating deployments and orchestrating cloud environments. Nice to have: Python (Jupyter, PyTorch), monitoring tools (Prometheus, Grafana), cloud databases (RDS, Aurora, Spanner), CI/CD tools (CircleCI), and data …
nodes for local and distributed training. Install, configure, and monitor servers, ensuring system reliability. Design and optimize storage solutions for large-scale ML datasets. Manage containerized applications with Docker, Kubernetes, Terraform, and related tools. Collaborate with ML engineers and researchers to ensure seamless orchestration of training and production environments. Troubleshoot and respond to cloud/production incidents, implementing long-term … Strong scripting skills (Bash, PowerShell, Python, etc.) for automation. Proven expertise in at least one major cloud platform (AWS, GCP, or Azure). Experience with containerization and orchestration (Docker, Kubernetes). Ability to manage and optimize large-scale cloud infrastructure. Familiarity with Python (Jupyter) and ML frameworks (e.g., PyTorch). Experience with cloud monitoring tools (Prometheus, Grafana). Exposure to …
London (City of London), South East England, United Kingdom
Harnham
nodes for local and distributed training. Install, configure, and monitor servers, ensuring system reliability. Design and optimize storage solutions for large-scale ML datasets. Manage containerized applications with Docker, Kubernetes, Terraform, and related tools. Collaborate with ML engineers and researchers to ensure seamless orchestration of training and production environments. Troubleshoot and respond to cloud/production incidents, implementing long-term … Strong scripting skills (Bash, PowerShell, Python, etc.) for automation. Proven expertise in at least one major cloud platform (AWS, GCP, or Azure). Experience with containerization and orchestration (Docker, Kubernetes). Ability to manage and optimize large-scale cloud infrastructure. Familiarity with Python (Jupyter) and ML frameworks (e.g., PyTorch). Experience with cloud monitoring tools (Prometheus, Grafana). Exposure to …
Isleworth, London, United Kingdom Hybrid / WFH Options
Staffworx Limited
real-world applications. Write code that lasts: clean, maintainable, well-tested, and resilient. Shape our infrastructure through modern DevOps practices: CI/CD pipelines (Jenkins, Concourse), containerisation, Helm, and Kubernetes deployments. Collaborate in Agile teams, driving technical direction, reviewing code, and suggesting smart process improvements. Keep our systems secure, observable, and high-performing at scale. Mentor and inspire junior engineers … TypeScript and Node.js. Strong understanding of GraphQL, from schema design to integration and evolution. Hands-on with testing frameworks like Vitest and Playwright. Comfortable with Docker, Helm, Kubernetes, and modern CI/CD pipelines. Experience in cloud-native architectures and infrastructure-as-code. A knack for improving codebases and influencing technical direction. Experience mentoring or coaching other …
Azure) and setting up ML nodes for both local development and distributed training. Optimizing storage systems to handle big data for ML. Building and scaling containerized applications with Docker, Kubernetes, Terraform, etc. Responding to production incidents and driving long-term solutions. Working closely with ML engineers to orchestrate smooth development and production workflows. What they're looking for: 3+ years … experience in a cloud-related role (ML-related is a big plus). Strong scripting ability (Bash, Python, PowerShell, etc.) for automation. Hands-on experience with containerization & orchestration (Docker, Kubernetes, Terraform). Skilled in cloud platforms (AWS, GCP, or Azure). Bonus points for: Familiarity with ML frameworks (PyTorch, Jupyter). Knowledge of cloud monitoring tools (Prometheus, Grafana). Experience …
As a Senior Linux DevOps Engineer, you'll shape a highly available, secure, and automated infrastructure that spans global data centers and containerized Kubernetes environments. You'll work with modern tools like Prometheus, Grafana Loki, Ansible, and NGINX, directly impacting system performance, stability, and security. What makes this role stand out is the mix of deep technical work, freedom to … across global data centers, ensuring uptime, implementing failover strategies, and applying security best practices (e.g., SELinux, firewalld, OpenSCAP). Assist in the operation and monitoring of containerized workloads in Kubernetes environments. Operate and enhance Utimaco's cybersecurity platform, ensuring seamless service delivery. Requirements: Bachelor's degree in a relevant field or equivalent practical experience, with a proven track record as …
Lexington, Massachusetts, United States Hybrid / WFH Options
Aquila Technology
Government standards. What You'll Bring: 5 years of running/maintaining databases/data stores (e.g., MySQL, InfluxDB, Elasticsearch); 5 years of virtualization and containerization (e.g., VMware, Docker, Kubernetes, Podman); 5 years of Linux system administration (RedHat + Ubuntu); 5 years of Windows system administration. Working Knowledge of the Following: Expertise in Linux system administration (RedHat + Ubuntu); Expertise … with Windows system administration; Expertise in virtualization and containerization (e.g., VMware, Docker, Kubernetes, Podman); Experience with running/maintaining databases/data stores (e.g., MySQL, InfluxDB, Elasticsearch). And These Skills are a Bonus: Prior experience with IT system security compliance (NIST, PCI, HIPAA, CMMC); macOS system administration; working knowledge of DevOps tools and pipelines; computer networking; Amazon Web Services (AWS …
cybersecurity compliance standards. Conduct unit testing, debugging, and peer code reviews to ensure software quality. Participate in Agile development workflows and deploy software using CI/CD pipelines and Kubernetes orchestration. Create and maintain technical documentation to support traceability and audit readiness. Collaborate with cybersecurity, systems engineering, and program management teams to deliver compliant, mission-ready software. Required Qualifications: Bachelor … software development lifecycle (SDLC). Strong attention to detail, ability to collaborate across disciplines, and commitment to secure development. Preferred Qualifications: Experience deploying Java applications to WebLogic servers. Familiarity with Kubernetes, Helm, and Helm Charts. Exposure to DoD environments, Agile ceremonies, or DevSecOps workflows. Knowledge or experience in Artificial Intelligence (AI) and Machine Learning (ML) technologies. If this sounds like you …
Nottingham, Nottinghamshire, East Midlands, United Kingdom Hybrid / WFH Options
Rebel Recruitment
tool for the job might not be the latest buzzword; you know that sometimes you can't beat the tried and tested methods! Tech-wise, you can't get enough of Kubernetes, you are a big Linux fan, and you have been using a host of Cloud and DevOps tools like Helm, Ansible, and AWS/GCP/OCI/Azure etc., for … product and, further down the line, there's scope for you to get involved in R&D and Proof of Concept (POC) projects. You'll be using the tech mentioned above: Kubernetes, Helm, Ansible, AWS/Azure/GCP/OCI, etc. Whilst you'll use AWS and potentially OCI/GCP/Azure for some parts of your role, you'll also be …
Responsibilities: Design, develop, and maintain scalable, event-driven microservices architecture. Build and optimise RESTful APIs following best practices and design patterns. Own the deployment and operation of services on Kubernetes. Ensure reliability, security, and performance through testing, monitoring, and optimisation. Collaborate closely with AI engineers to integrate LLM/ML solutions into backend services. Contribute to system architecture decisions and … Core (NestJS + TypeScript in our stack). Experience with microservices architecture, including event-driven systems. Deep knowledge of RESTful API design and implementation. Proficiency with containerisation and orchestration using Kubernetes (we use GKE). Database experience across paradigms (MongoDB in our stack). Strong understanding of authentication/authorization and secure coding practices (Auth0 in our stack). Solid experience with automated testing …