U.S.) or 25 miles (non-U.S., country-specific) of that location. This expectation is subject to local law and may vary by jurisdiction. Responsibilities: Design and develop Python and CUDA/HIP C++ code that enables distributed training of multimodal LLMs ingesting text, audio, image, or video data. Build and maintain cutting-edge infrastructure that can store and process …
Inference experience for high-throughput model serving - Proven ability to work on air-gapped systems with no external package repositories - Experience with GPU orchestration (NVIDIA A100/H100) and CUDA optimisation - Python expertise with offline dependency management and local package mirrors. Technical Stack (All On-Premises): Models: Llama 3, Mistral, Qwen (locally hosted); Vector Stores: Chroma, FAISS, Milvus; Orchestration …
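The stack above lists Chroma, FAISS, and Milvus as vector stores. At their core these perform nearest-neighbour search over embeddings; the following is a minimal pure-Python sketch of the exact (brute-force) scan that FAISS's `IndexFlat` variants perform, with hypothetical document IDs and toy 3-dimensional embeddings standing in for real data:

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); assumes non-zero vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    # index: list of (doc_id, embedding) pairs.
    # Brute-force scan over every stored vector, then sort by score;
    # dedicated vector stores replace this with approximate indexes.
    scored = [(doc_id, cosine_similarity(query, emb)) for doc_id, emb in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy index: three documents with made-up embeddings.
index = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.9, 0.1, 0.0]),
    ("doc-c", [0.0, 1.0, 0.0]),
]
print([doc_id for doc_id, _ in top_k([1.0, 0.0, 0.0], index)])  # ['doc-a', 'doc-b']
```

In an air-gapped deployment the same logic runs entirely on local data; nothing here requires network access or external package repositories.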
London (City of London), South East England, United Kingdom
83data
design patterns. Experience with data science and ML tools (e.g., NumPy, pandas, scikit-learn, PyTorch) and open-source contributions (especially Python-based) would be a bonus. Familiarity with CUDA, GPU-based computation, end-to-end neural network training, MLOps, and academic machine learning research is also beneficial. Experience configuring and maintaining cloud infrastructure, including network infrastructure, compute …
London (City of London), South East England, United Kingdom
Safe Intelligence
modular code delivery in Docker-based environments. Desirable Experience: Experience with PyTorch for AI-based perception/control. Familiarity with MoveIt for motion planning in ROS2. Knowledge of CUDA for real-time optimisation in C++. To Apply: Please email your CV. Desired Skills and Experience: Python: Advanced proficiency in Python, leveraging scientific and numerical libraries (e.g., NumPy, SciPy) for …
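Real-time perception/control work of the kind described above typically runs a fixed-rate loop that sleeps until the next absolute deadline rather than for a constant interval, so timing error does not accumulate. A minimal Python sketch of that pattern follows; the proportional controller, its gain, and the setpoint are invented for illustration, not taken from the role:

```python
import time

def control_loop(step, rate_hz, cycles):
    # Fixed-rate loop: advance an absolute deadline each cycle and
    # sleep only for the time remaining, so drift does not compound.
    period = 1.0 / rate_hz
    deadline = time.monotonic()
    outputs = []
    for _ in range(cycles):
        outputs.append(step())
        deadline += period
        remaining = deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
    return outputs

# Hypothetical plant: proportional control toward a setpoint of 1.0.
state = {"position": 0.0}

def p_step(setpoint=1.0, gain=0.5):
    error = setpoint - state["position"]
    state["position"] += gain * error
    return state["position"]

print(control_loop(p_step, rate_hz=100, cycles=5))
# [0.5, 0.75, 0.875, 0.9375, 0.96875]
```

In production ROS2 nodes the same idea is handled by rate/timer utilities; the point of the sketch is the deadline arithmetic, not the controller.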
design patterns. Experience with data science and ML tools (e.g., NumPy, pandas, scikit-learn, PyTorch) and open-source contributions (especially Python-based) would be a bonus. Familiarity with CUDA, GPU-based computation, end-to-end neural network training, MLOps, and academic machine learning research is also beneficial. At a personal level, we're also looking for someone …
City of London, Greater London, UK Hybrid / WFH Options
Safe Intelligence
optimise state-of-the-art algorithms and architectures, ensuring compute efficiency and performance. Low-Level Mastery: Write high-quality Python, C/C++, XLA, Pallas, Triton, and/or CUDA code to achieve performance breakthroughs. Required Skills: Understanding of Linux systems, performance analysis tools, and hardware optimisation techniques. Experience with distributed training frameworks (Ray, Dask, PyTorch Lightning, etc.). Expertise … with machine learning frameworks (JAX, TensorFlow, PyTorch, etc.). Passion for profiling, identifying bottlenecks, and delivering efficient solutions. Highly Desirable: Track record of successfully scaling ML models. Experience writing custom CUDA kernels or XLA operations. Understanding of GPU/TPU architectures and their implications for efficient ML systems. Fundamentals of modern deep learning. Actively following ML trends and a desire … to push boundaries. Example Projects: Profile algorithm traces, identifying opportunities for custom XLA operations and CUDA kernel development. Implement and apply SOTA architectures (Mamba, Griffin, Hyena) to research and applied projects. Adapt algorithms for large-scale distributed architectures across HPC clusters. Employ memory-efficient techniques within models for increased parameter counts and longer context lengths. What We Offer: Real …
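"Profiling, identifying bottlenecks, and delivering efficient solutions" starts with attributing time to functions. A minimal stdlib sketch with `cProfile`/`pstats` is below; the `slow_norm`/`fast_norm` pair is an invented example of a classic bottleneck (a reduction recomputed inside a loop), not code from the role:

```python
import cProfile
import io
import pstats

def slow_norm(vec):
    # Deliberately quadratic: recomputes sum(vec) for every element.
    return [x / sum(vec) for x in vec]

def fast_norm(vec):
    total = sum(vec)  # hoist the reduction out of the loop: linear time
    return [x / total for x in vec]

def profile(fn, vec):
    # Profile a single call and return the top entries by cumulative time.
    profiler = cProfile.Profile()
    profiler.enable()
    fn(vec)
    profiler.disable()
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
    return stream.getvalue()

vec = [1.0] * 2000
# The report shows built-in sum called once per element of vec,
# flagging slow_norm's inner reduction as the bottleneck.
print(profile(slow_norm, vec))
```

The same workflow scales up to trace-level tools (Nsight, JAX/XLA profilers) for GPU/TPU work; the loop of measure, attribute, and rewrite is identical.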
Greater London, England, United Kingdom Hybrid / WFH Options
microTECH Global LTD
to deploy GPU and ML workloads at scale. Provision and optimise GPU cloud infrastructure (AWS, GCP, Azure) using Terraform/Ansible. Collaborate with GPU engineers and researchers to integrate CUDA, SYCL, Vulkan, and ML kernels into production workflows. Support secure packaging, deployment, and distribution of GPU-accelerated software to partners and clients. Evolve infrastructure to support hybrid AI/… GitLab CI, etc.). Proficiency in containerisation and orchestration (Docker, Kubernetes). Experience with cloud GPU infrastructure (AWS, Azure, GCP) and IaC (Terraform, Ansible). Familiarity with GPU workflows (CUDA, SYCL, Vulkan, OpenCL) or HPC performance optimisation. Strong scripting and programming skills (Python, Bash; C/C++ exposure a plus). Knowledge of monitoring, logging, and performance testing for …