diverse vendor platforms. Working with low-level system and memory management techniques to minimize latency and improve real-time inference performance. Utilizing and implementing GPU programming APIs (e.g., CUDA, OpenCL) to ensure high efficiency and compatibility across GPUs. Profiling and debugging system performance using tools like NVIDIA Nsight, Intel VTune, and vendor-specific profilers, identifying bottlenecks and implementing effective … systems. Essential Requirements: 3+ years of experience in C++ programming, with a strong grasp of modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying and More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Annapurna
diverse vendor platforms. Working with low-level system and memory management techniques to minimize latency and improve real-time inference performance. Utilizing and implementing GPU programming APIs (e.g., CUDA, OpenCL) to ensure high efficiency and compatibility across GPUs. Profiling and debugging system performance using tools like NVIDIA Nsight, Intel VTune, and vendor-specific profilers, identifying bottlenecks and implementing effective … systems. Essential Requirements: 3+ years of experience in C++ programming, with a strong grasp of modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying and More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Annapurna
diverse vendor platforms. Working with low-level system and memory management techniques to minimize latency and improve real-time inference performance. Utilizing and implementing GPU programming APIs (e.g., CUDA, OpenCL) to ensure high efficiency and compatibility across GPUs. Profiling and debugging system performance using tools like NVIDIA Nsight, Intel VTune, and vendor-specific profilers, identifying bottlenecks and implementing effective … systems. Essential Requirements: 3+ years of experience in C++ programming, with a strong grasp of modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying and More ❯
Slough, South East England, United Kingdom Hybrid / WFH Options
Annapurna
diverse vendor platforms. Working with low-level system and memory management techniques to minimize latency and improve real-time inference performance. Utilizing and implementing GPU programming APIs (e.g., CUDA, OpenCL) to ensure high efficiency and compatibility across GPUs. Profiling and debugging system performance using tools like NVIDIA Nsight, Intel VTune, and vendor-specific profilers, identifying bottlenecks and implementing effective … systems. Essential Requirements: 3+ years of experience in C++ programming, with a strong grasp of modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying and More ❯
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Annapurna
diverse vendor platforms. Working with low-level system and memory management techniques to minimize latency and improve real-time inference performance. Utilizing and implementing GPU programming APIs (e.g., CUDA, OpenCL) to ensure high efficiency and compatibility across GPUs. Profiling and debugging system performance using tools like NVIDIA Nsight, Intel VTune, and vendor-specific profilers, identifying bottlenecks and implementing effective … systems. Essential Requirements: 3+ years of experience in C++ programming, with a strong grasp of modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying and More ❯
in GPU kernel development and optimization for AI/HPC applications. Strong technical and analytical skills in GPU computing, hardware architecture, and deep understanding of HIP/CUDA/OpenCL/Triton development. Ability to work as part of a team, deliver to project scope, and communicate effectively to both technical and non-technical audiences. KEY RESPONSIBILITIES: Develop high … AI operator performance (GEMM, Attention, Distributed scale-up/out communication, etc.). Apply your knowledge of software engineering best practices. PREFERRED EXPERIENCE: Knowledge of GPU computing (HIP, CUDA, OpenCL, Triton). Experience in optimizing GPU kernels. Proficiency with profiling and debugging tools. Core understanding of GPU hardware. Excellent C/C++/Python programming and software design skills, including More ❯
but not essential. Keywords: Compiler/Compilation/LLVM/GCC/OpenSource/Linux/C/C++/Low level/Hardware/debuggers/Fortran/OpenCL/CUDA/MLIR/Machine Learning/GPU/GPGPU By applying to this role you understand that we may collect your personal data and store and process More ❯
edge technology focused on delivering high-performance and energy-efficient compute platforms for modern AI workloads. You'll be working on a flagship GPU and AI platform supporting PyTorch, OpenCL, and Vulkan, designed to bring scalable, efficient AI capabilities to developers and researchers across the industry. Role Overview: As a Software Engineer - AI Framework, you'll be responsible More ❯
City of London, London, United Kingdom Hybrid / WFH Options
European Tech Recruit
experience in deploying SLAM solutions. Proficiency in C++. Desirable experience: PhD in computer vision or robotics. Experience with machine learning techniques for geometric & semantic estimation. GPU programming skills (CUDA, OpenCL, Vulkan, Metal). Experience with embedded software development. If this role is of any interest please apply directly on LinkedIn or send a copy of your CV to nh More ❯
South East London, England, United Kingdom Hybrid / WFH Options
European Tech Recruit
experience in deploying SLAM solutions. Proficiency in C++. Desirable experience: PhD in computer vision or robotics. Experience with machine learning techniques for geometric & semantic estimation. GPU programming skills (CUDA, OpenCL, Vulkan, Metal). Experience with embedded software development. If this role is of any interest please apply directly on LinkedIn or send a copy of your CV to nh More ❯
London, South East England, United Kingdom Hybrid / WFH Options
European Tech Recruit
experience in deploying SLAM solutions. Proficiency in C++. Desirable experience: PhD in computer vision or robotics. Experience with machine learning techniques for geometric & semantic estimation. GPU programming skills (CUDA, OpenCL, Vulkan, Metal). Experience with embedded software development. If this role is of any interest please apply directly on LinkedIn or send a copy of your CV to nh More ❯
Slough, South East England, United Kingdom Hybrid / WFH Options
European Tech Recruit
experience in deploying SLAM solutions. Proficiency in C++. Desirable experience: PhD in computer vision or robotics. Experience with machine learning techniques for geometric & semantic estimation. GPU programming skills (CUDA, OpenCL, Vulkan, Metal). Experience with embedded software development. If this role is of any interest please apply directly on LinkedIn or send a copy of your CV to nh More ❯
London (City of London), South East England, United Kingdom Hybrid / WFH Options
European Tech Recruit
experience in deploying SLAM solutions. Proficiency in C++. Desirable experience: PhD in computer vision or robotics. Experience with machine learning techniques for geometric & semantic estimation. GPU programming skills (CUDA, OpenCL, Vulkan, Metal). Experience with embedded software development. If this role is of any interest please apply directly on LinkedIn or send a copy of your CV to nh More ❯
calculation, compilation, algorithm and chip co-design, runtime, or shared memory Strong background in software development using C/C++ and Python Skilled with GPU compute APIs (e.g., CUDA, OpenCL), deep learning frameworks, and compilers Familiarity with AI models, algorithm trends, and translating application requirements into chip-level solutions Experience with GPU acceleration, inference backends, and frameworks such as More ❯
Proficiency in C++ with a strong focus on memory management, multi-threading, and low-level performance optimizations. Experience with GPU architectures (e.g., NVIDIA, AMD) and programming frameworks like CUDA, OpenCL, and TensorFlow. Understanding of machine learning algorithms, including model training and inference, and how to optimize these for GPU-based computation. Strong knowledge of parallel computing, vectorization, and multi More ❯
London (City of London), South East England, United Kingdom
Hunter Bond
Proficiency in C++ with a strong focus on memory management, multi-threading, and low-level performance optimizations. Experience with GPU architectures (e.g., NVIDIA, AMD) and programming frameworks like CUDA, OpenCL, and TensorFlow. Understanding of machine learning algorithms, including model training and inference, and how to optimize these for GPU-based computation. Strong knowledge of parallel computing, vectorization, and multi More ❯
London, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
high-impact software features. You work collaboratively, thrive in ambiguity, and take full ownership of what you build. Key technical skills Working knowledge of C++ and GPU computing (CUDA, OpenCL) Proven ability to design, build, and maintain robust APIs Proficiency with cloud platforms (e.g. AWS, GCP, or Azure), containerisation, and CI/CD pipelines Familiarity with scalable data delivery More ❯
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
high-impact software features. You work collaboratively, thrive in ambiguity, and take full ownership of what you build. Key technical skills Working knowledge of C++ and GPU computing (CUDA, OpenCL) Proven ability to design, build, and maintain robust APIs Proficiency with cloud platforms (e.g. AWS, GCP, or Azure), containerisation, and CI/CD pipelines Familiarity with scalable data delivery More ❯