South East London, England, United Kingdom Hybrid / WFH Options
Techfellow Limited
as-code mindset Hands-on experience resolving GPU workload issues across compute clusters and supporting technologies Familiarity with performance tooling and debugging in live production environments Practical experience with CUDA or systems-level programming in C/C++ Experience with config management frameworks like Salt, Ansible, or Puppet (Preferred) Experience with GPU communication and interconnect technologies (e.g. collective communication
Contribute to hiring additional talent for our rapidly growing team. The role will be exposed to a broad tech stack (e.g. ReactJS, Python, REST & GraphQL, OpenCV, PyTorch, GCP, AWS & CUDA, Kubernetes) and the cutting edge of computer vision and deep learning. Qualifications: The right candidate will have a proven track record of relevant publications and previous experience managing applied
of model architectures like transformers and CNNs. Hands-on experience with model optimization (e.g. quantization, pruning) and model deployment frameworks such as TensorRT, ONNX Runtime, and OpenVINO. Proficiency with CUDA programming and optimizing code for GPU acceleration. Strong background in MLOps practices, including CI/CD using GitHub Actions and containerization with Docker. Excellent problem-solving skills and the
using Docker and orchestrating them with Kubernetes for scalable model serving. Optimizing the performance of our Ultralytics YOLO11 models for various deployment targets, from high-performance cloud GPUs with CUDA to edge devices using frameworks like TensorRT and OpenVINO. Implementing robust systems for model monitoring and maintenance to track performance and detect data drift. Collaborating closely with our AI … experience with at least one major cloud provider (GCP, Azure, AWS). Experience with Infrastructure as Code (IaC) tools such as Terraform or Ansible. Familiarity with GPU acceleration using CUDA and model optimization for inference. Knowledge of MLOps tools for experiment tracking and model serving, such as MLflow, Kubeflow, or Weights & Biases. Excellent problem-solving skills and the ability
with the latest developments in model optimization, inference engines, quantization methods, and related technologies. Requirements: Proven professional experience optimizing neural network inference workloads. Strong expertise with TensorRT, the Triton language, and CUDA programming. Experience with neural network quantization techniques. Proficiency in Python and PyTorch. Deep understanding of GPU architectures and performance optimization. Excellent problem-solving skills and ability to analyze performance
/CD pipelines using GitHub Actions. Experience with analytics platforms like Google Analytics and business intelligence tools like Tableau or Power BI. Knowledge of GPU-accelerated computing with CUDA is highly desirable. Excellent problem-solving skills and the ability to thrive in a fast-paced, high-intensity startup environment. 🌟 Cultural Fit - Intensity Required Ultralytics is a high-performance
that bear little resemblance to publicly available substitutes. Utmost integrity, confidentiality, and discretion in both internal and external interactions. What We Value: Experience writing and optimizing compute kernels with CUDA or similar languages. History of developing creative approaches to drive high ML accuracy within an allotted computational budget. Competitive Compensation: We provide financial peace of mind with competitive base