e.g., Unreal Engine, Unity, custom 3D engines). Proven track record of publications at top-tier conferences (e.g., NeurIPS, CVPR, ICML, ICLR, SIGGRAPH, ECCV). Experience with GPU programming (CUDA) and model optimization for real-time inference (e.g., quantization, pruning, ONNX, TensorRT, custom CUDA kernels). Background in scalable algorithm design for real-time or interactive applications. Experience
code on Linux or embedded platforms. Demonstrated ability to deliver production-quality, well-tested code in collaborative, fast-moving environments. Preferred Qualifications Familiarity with GPU or edge AI acceleration (CUDA, TensorRT, Vulkan, or similar). Experience deploying perception pipelines on resource-constrained hardware. Publications in multimodal sensing/neural representations/SLAM for robotics or autonomous navigation in journals
validation processes, libraries, and optimization frameworks • Experience with CI/CD pipelines Nice to Haves • Knowledge of AWS cloud services (EKS, ECS, etc.) and Docker • Knowledge of C++ and CUDA, with experience writing custom kernels/operators • Proficiency in distributed AI training and parallel processing • Familiarity with recent Deep Learning and Computer Vision architectures • Data visualization skills • Excellent communication
Creative, communicative and team working Experience in weather and climate science industries Preferred Skills Man-Machine interactions, User Interface, Ergonomic and Graphical approach to UI Computational Fluid Dynamics (CFD) CUDA, Kokkos parallel programming Pay range and compensation package We offer exciting and dynamic salary packages that include stock options and a variety of social benefits. Starts now and be
language, vision and other modalities, machine learning for molecules and proteins (ideally with some background in chemistry and biological sciences). Lower-level programming for hardware efficiency, e.g. C CUDA/Triton. Practical familiarity with hardware capabilities for deep learning - threads, caches, vector & matrix engines, data dependencies, bus widths and throttling. Practical familiarity with software stacks for deep learning
sound engineering principles to ensure robust, maintainable solutions. PREFERRED EXPERIENCE: GPU Kernel Development & Optimization: Experienced in designing and optimizing GPU kernels for deep learning on AMD GPUs using HIP, CUDA, and assembly (ASM). Strong knowledge of AMD architectures (GCN, RDNA) and low-level programming to maximize performance for AI operations, leveraging tools like Composable Kernel (CK), CUTLASS, and
models, building production systems with large language models, efficient computing with low-precision arithmetic, or large generative models for language, vision, and other modalities. Experience writing C++, Triton, or CUDA kernels for performance optimisation of ML models. Contributions to open-source projects or published research papers in relevant fields. Knowledge of cloud computing platforms. Keen to present, publish, and
United Kingdom Recommended Qualifications Familiarity with GIS applications and technologies Cross-platform development, profiling, and debugging Understanding of scientific, spatial and graphics algorithms and software design patterns Experience with CUDA, Direct3D, Metal, OpenGL, Vulkan, WebGL, or WebGPU, and compute shader programming Experience with agile development methodologies (such as Scrum) Postgraduate degree in Computer Science or related STEM field Base
London, England, United Kingdom Hybrid / WFH Options
Treecode
above in Machine Learning, Computer Science, Engineering, or a related technical discipline or equivalent experience Desirable Strong software engineering experience in Python and other relevant languages (e.g. C++ and CUDA) Direct experience working in at least one of computer vision, robotics, simulation, graphics, or large language models. MS, or above in Machine Learning, Computer Science, Engineering, or a related
London, England, United Kingdom Hybrid / WFH Options
Wayve
BSc above in Machine Learning, Computer Science, Engineering, or a related technical discipline or equivalent experience Strong software engineering experience in Python and other relevant languages (e.g. C++ and CUDA) Direct experience working in at least one of computer vision, robotics, simulation, graphics, or large language models. MS, or above in Machine Learning, Computer Science, Engineering, or a related
Earl's Court, England, United Kingdom Hybrid / WFH Options
Wayve
BSc above in Machine Learning, Computer Science, Engineering, or a related technical discipline or equivalent experience Strong software engineering experience in Python and other relevant languages (e.g. C++ and CUDA) Direct experience working in at least one of computer vision, robotics, simulation, graphics, or large language models. MS, or above in Machine Learning, Computer Science, Engineering, or a related
City of London, London, United Kingdom Hybrid / WFH Options
Annapurna
across diverse vendor platforms. Working with low-level system and memory management techniques to minimize latency and improve real-time inference performance. Utilizing and implementing GPU programming APIs (e.g., CUDA, OpenCL) to ensure high efficiency and compatibility across GPUs. Profiling and debugging system performance using tools like NVIDIA Nsight, Intel VTune, and vendor-specific profilers, identifying bottlenecks and implementing … autonomous systems. Essential Requirements: 3+ years of experience in C++ programming, with a strong grasp of modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying
analysis to prototype quickly Desirable Experience Experience with TensorRT, NVIDIA DeepStream, or other deployment frameworks Background in neural network design or edge inference Programming in C/C++ and CUDA Real-time or embedded vision applications Why Join AssetCool? Tackle some of the toughest challenges in robotics, vision, and infrastructure tech Join a growing team with global ambitions and a
you will specialize in the design, development, and optimization of GPU kernels and algorithms to support the training and inference of symbolic reasoning models. You will leverage frameworks like CUDA and CUTLASS, along with compiler optimization techniques, to push the boundaries of performance for high-dimensional computation. Your focus Developing and optimizing GPU kernels for high-performance symbolic reasoning … maximum efficiency. About you Strong proficiency in at least one high-performance programming language (C, C++, Rust, Haskell, or Julia) and familiarity with Python. Proficiency in GPU programming with CUDA, including experience with kernel development, compiler optimizations, and performance tuning. In-depth knowledge of GPU architecture, including memory hierarchies, thread blocks, warps, and scheduling. Experience with compiler development, LLVM
PyTorch internals and other major ML frameworks. Experience optimizing deep learning performance on accelerator hardware. Solid knowledge of deep learning algorithms and compute patterns. Strong programming skills in C++, CUDA, or OpenCL. Background in performance profiling and optimization. BS/MS in Computer Science, Electrical Engineering, or a related field. Interested? Send your CV to to apply.
numerical calculation, compilation, algorithm and chip co-design, runtime, or shared memory Strong background in software development using C/C++ and Python Skilled with GPU compute APIs (e.g., CUDA, OpenCL), deep learning frameworks, and compilers Familiarity with AI models, algorithm trends, and translating application requirements into chip-level solutions Experience with GPU acceleration, inference backends, and frameworks such
related projects Experience deploying MLOps solutions and working within CI/CD frameworks Experience with Linux systems and cloud infrastructure (AWS, Azure, etc.) Experience developing embedded ML applications (C++, CUDA, TensorRT) Language Skills: Good level of English, spoken and written Personal Skills: Capability to integrate in and work within a trans-European team Solid organisational, analytical and reporting skills
as: Puppet, Chef, SMS, Satellite, etc. Knowledge of interpreted and compiled computer programming languages such as Python, Java, C, Objective-C, C++, C#, SQL, Tcl, Perl, PHP. Assembly, CUDA, and GPU language experience desirable. Knowledge of advanced computing technologies such as parallel processing, in-memory databases, graph databases and graph theory, machine learning, deep learning, and neural networks.
Basildon, Essex, United Kingdom Hybrid / WFH Options
Leonardo Company
looking for: Essential: C# software development Machine-to-machine networking, working to third-party interface definitions Test frameworks and test development (not test-driven development) Microservices architecture/containerisation CUDA integration (AI/ML) Development of new applications to meet user expectations and within formal constraints. HMI/GUI/UX experience needed. Familiarity with the tools and approaches