Agile events and activities. Team Technologies used include Python, Conda, Behavior Driven Development (PyTest-BDD, Cucumber), Gherkin, Ubuntu, Docker, Jenkins, Bash, Groovy, C++/CUDA, Jira, and GitHub. Work schedule is flexible, but some overlap with team members in different time zones will be required (two regular meetings per week…
another engineering field. Examples include nonlinear estimation, numerical simulation, nonlinear optimization, and control theory. Experience in the following would be beneficial but not mandatory: CUDA C/C++ GPU computing; high-performance computing; scientific computing; natural language processing; computer vision. Compensation and Benefits: Base Salary Range…
MLIR, Triton, etc.). Expertise in tailoring algorithms and ML models to exploit GPU strengths and minimize weaknesses. Knowledge of low-level GPU programming (CUDA, OpenCL, etc.) and performance tuning techniques. Understanding of modern GPU architectures, memory hierarchies, and performance bottlenecks. Ability to develop and utilize sophisticated performance models…
Platforms (AWS, Azure, GCP) for model deployment and scaling. Understanding of Edge Computing and on-device model optimization (TensorRT, ONNX). Knowledge of NVIDIA CUDA for GPU acceleration. WANDB and MLflow for training monitoring. Contributions to open-source computer vision or deep learning projects. What We Offer: Competitive Compensation…
medical device development. Technical Expertise: Experience with multi-tasking systems (real-time preferable) and familiarity with signal processing or AI/ML applications using CUDA on GPUs (preferred), medical device communications protocols (HL7, FHIR). Development Approach: Knowledge of agile methodologies and best practices in software development. Tools & Practices: Proficiency…
complex machine learning algorithms into scalable, production-quality code, with proficiency in Python and a strong understanding of optimization techniques (experience with Cython and CUDA is a plus). Experience in developing Large Language Models (LLMs) is advantageous. In-depth understanding of computer architecture and its implications on AI…
and collaboration skills to work effectively in multidisciplinary teams. Knowledge of ethical AI practices and laws is a plus. Preferred Skills: Knowledge of NVIDIA CUDA, cuDNN, and TensorRT, and experience with the NVIDIA GPU hardware and software stack. Understanding of HPC and AI workloads. Familiarity with big data platforms and technologies, such…
a strong focus on memory management, multi-threading, and low-level performance optimizations. Experience with GPU architectures (e.g., NVIDIA, AMD) and programming frameworks like CUDA, OpenCL, and TensorFlow. Understanding of machine learning algorithms, including model training and inference, and how to optimize these for GPU-based computation. Strong knowledge…
to 5 years' experience building technical or scientific software. Bonus if you have an understanding of: CFD, meshing, parallel computing, GPU computing such as CUDA, and/or CAD
ML frameworks. Experience optimizing deep learning performance on accelerator hardware. Solid knowledge of deep learning algorithms and compute patterns. Strong programming skills in C++, CUDA, or OpenCL. Background in performance profiling and optimization. BS/MS in Computer Science, Electrical Engineering, or a related field. Interested? Send your CV…
Background: Experience in highly regulated industries, preferably in medical device development. Technical Expertise: Experience with multi-tasking systems, Linux and RTOS, FPGAs, micro-controllers, CUDA, communication protocols (e.g. I2C, SPI, UART, USB, Ethernet, PCIe), driver development and familiarity with signal processing using GPU (preferred). Development Approach: Knowledge of…
London, South East England, United Kingdom Hybrid / WFH Options
Enigma
AI/ML in the sports domain to create insights or data. Advanced systems knowledge, such as: Developing GPU kernels or ML compilers (e.g., CUDA, OpenCL, TensorRT Plugins, MLIR, TVM). System optimization for latency and utilization, using tools like NVIDIA Nsight. Working with embedded SoCs (e.g., NVIDIA…
Metal developers to tune their applications for maximum performance on Apple Silicon. Minimum Qualifications: Understand the graphics pipeline. GPU programming with Metal, DirectX, Vulkan, CUDA, DirectCompute, OpenGL, or OpenCL. Programming knowledge of C/C++. Carry forward highly complex software debug efforts. Preferred Qualifications: Excellent written and oral…
on low-precision arithmetic, deep learning models including large generative models for language, vision and other modalities. Experience writing C++/Triton/CUDA kernels for performance optimisation of ML models. Have contributed to open-source projects or published research papers in relevant fields. Knowledge of cloud computing…
detailed breakdown of all the technologies we use: Backend: Python. Frontend: TypeScript and React. Kubernetes for deployment. GCP for underlying infrastructure. Machine Learning: PyTorch, CUDA, Ray. We encourage people from all backgrounds, cultures, and skill levels to apply. It is okay to not meet all requirements listed as we…
South West London, London, United Kingdom Hybrid / WFH Options
La Fosse
Sports tech experience: Background applying AI/ML in the sports domain for data generation or insights. Systems optimisation: Knowledge of GPU kernel development (CUDA, OpenCL, etc.), real-time system optimisation (e.g., NVIDIA Nsight), or experience working with embedded SoCs (NVIDIA, Qualcomm, etc.). If you're interested in…
the boundaries of model performance. You'll also work on re-implementing models efficiently using PyTorch and underlying technologies such as CUDA kernels and Torch compilation techniques. This would include: Evaluating and optimising compute resource usage (e.g., Hopper GPUs) for cost and time efficiency at training and…
Hands-on with monitoring tools like Prometheus and Grafana. Nice to have: Experience building Developer Experience (DevX) tools and workflows. Familiarity with GPU setups (CUDA, TensorFlow, etc.). Strong networking and network security knowledge. Linux/Unix skills and shell scripting. A degree in Computer Science or a related field…
London, South East England, United Kingdom Hybrid / WFH Options
Velocity Tech
Hands-on with monitoring tools like Prometheus and Grafana. Nice to have: Experience building Developer Experience (DevX) tools and workflows. Familiarity with GPU setups (CUDA, TensorFlow, etc.). Strong networking and network security knowledge. Linux/Unix skills and shell scripting. A degree in Computer Science or a related field…
decoding, and transmission at scale (e.g. HLS, WebRTC, and FFmpeg). Accelerator experience. You've developed GPU kernels and/or ML compilers (e.g., CUDA, OpenCL, TensorRT Plugins, MLIR, TVM, etc.). Real-time experience. You've optimized systems to meet strict utilization and latency requirements with tools such…