JIRA, Git, Jenkins, Java, bash, batch files, TestRail. • 2D and 3D geometrical modelling experience; geometrical APIs or toolkits including CGAL. • Multithreading and parallel programming experience; OpenMP; GPU programming using CUDA or OpenCL. • Scripting of mathematical or geological problems; Excel, MATLAB, Python. Knowledge of any/several of the following will be ideal: • Seismic processing and attribute analysis. • Modelling of More ❯
London, England, United Kingdom Hybrid / WFH Options
ECM Selection
Qt, QML); 3D graphics toolkits (OpenGL, Vulkan or shaders); CI experience (CMake, JIRA, Git, Jenkins); GIS development tools (GDAL API, MapBox API); multithreading/parallel computing (GPU programming or CUDA); MATLAB/Python scripting for mathematical/geology problems would be advantageous. Due to specific requirements, applicants without the relevant project experiences will not be considered (similarly exposure to More ❯
and Qualifications Required Skills: Bachelor's or higher degree in Computer Science/Engineering or related disciplines. Professional software development experience with modern C++. Experience with GPU compute in CUDA/OpenCL. Knowledge of linear algebra equivalent to at least first-year university level. Strong computer science and engineering fundamentals (e.g., OS, Compiler). Familiarity with software engineering practices More ❯
London, England, United Kingdom Hybrid / WFH Options
PhysicsX Ltd
scaling and optimising ML models, training and serving foundation models at scale (federated learning a bonus); distributed computing frameworks (e.g., Spark, Dask) and high-performance computing frameworks (MPI, OpenMP, CUDA, Triton); cloud computing (on hyper-scaler platforms, e.g., AWS, Azure, GCP); building machine learning models and pipelines in Python, using common libraries and frameworks (e.g., NumPy, SciPy, Pandas, PyTorch More ❯
London, England, United Kingdom Hybrid / WFH Options
NVIDIA
/Digital Twins. Proficiency in deploying AI models and optimizing inference using TensorRT, ONNX Runtime, Triton, or TensorRT-LLM is a plus. Proven experience implementing and optimizing workloads with CUDA and Nsight Tools. Experience with high-performance networking technologies (e.g. DPDK, DOCA, RDMA, RoCEv2) is a plus. Published record of thought leadership in a technical area or industry segment. More ❯
Experience: Proficiency in C++ with a strong focus on memory management, multi-threading, and low-level performance optimizations. Experience with GPU architectures (e.g., NVIDIA, AMD) and programming frameworks like CUDA, OpenCL, and TensorFlow. Understanding of machine learning algorithms, including model training and inference, and how to optimize these for GPU-based computation. Strong knowledge of parallel computing, vectorization, and More ❯
Hands-on experience designing low-latency, high-performance, real-time video or image processing software • Experience developing or implementing real-time image processing algorithms using hardware acceleration • Experience with CUDA or OpenCL • Experience with TensorRT, Triton, or equivalent AI acceleration/inferencing frameworks • Ability to write clear, maintainable and well-documented code • Capability to work independently, driving development from More ❯
development Experience with software development processes and tools such as Git source code control, profiler, and debugger Effective communication and problem-solving skills Experience with compute languages like HIP, CUDA, OpenCL is a plus ACADEMIC CREDENTIALS: Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent #LI-RA1 #LI-Remote Benefits offered are described More ❯
at a leading technology company. Strong expertise in algorithms, data structures, multivariate calculus, and linear algebra. Proficient in Python, TensorFlow, PyTorch, or similar languages and frameworks, with experience writing CUDA kernels and profiling GPU code a plus. Excellent communication skills, with the ability to work effectively in cross-functional teams and present complex ideas to both technical and non More ❯
systems for GPU architectures (OpenXLA, MLIR, Triton, etc.). Expertise in tailoring algorithms and ML models to exploit GPU strengths and minimize weaknesses. Knowledge of low-level GPU programming (CUDA, OpenCL, etc.) and performance tuning techniques. Understanding of modern GPU architectures, memory hierarchies, and performance bottlenecks. Ability to develop and utilize sophisticated performance models and benchmarks to guide optimization More ❯
e.g., Unreal Engine, Unity, custom 3D engines). Proven track record of publications at top-tier conferences (e.g., NeurIPS, CVPR, ICML, ICLR, SIGGRAPH, ECCV). Experience with GPU programming (CUDA) and model optimization for real-time inference (e.g., quantization, pruning, ONNX, TensorRT, custom CUDA kernels). Background in scalable algorithm design for real-time or interactive applications. Experience More ❯
Skills: Proficiency in C/C++ and Python Technical Expertise: Experience with multi-tasking systems (real-time preferable) and familiarity with signal processing or AI/ML applications using CUDA on GPUs (preferred), medical device communications protocols (HL7, FHIR) Development Approach: Knowledge of agile methodologies and best practices in software development Tools & Practices: Proficiency with version control systems (e.g. More ❯
highly regulated industries, preferably in medical device development Technical Expertise: Experience with multi-tasking systems (real-time preferable) and familiarity with signal processing or AI/ML applications using CUDA on GPUs (preferred), medical device communications protocols (HL7, FHIR) Development Approach: Knowledge of agile methodologies and best practices in software development Tools & Practices: Proficiency with version control systems (e.g. More ❯
of the mathematical foundations of deep learning, including multivariate calculus, linear algebra, and optimization techniques. Proficient in Python and deep learning frameworks such as TensorFlow and PyTorch. Experience with CUDA kernels and GPU profiling is a plus. Excellent communication skills, with the ability to present complex technical ideas to both technical and non-technical audiences. Knowledge of quantitative finance More ❯
language, vision and other modalities, machine learning for molecules and proteins (ideally with some background in chemistry and biological sciences). Lower-level programming for hardware efficiency, e.g. C CUDA/Triton. Practical familiarity with hardware capabilities for deep learning: threads, caches, vector and matrix engines, data dependencies, bus widths and throttling. Practical familiarity with software stacks for deep learning More ❯
sound engineering principles to ensure robust, maintainable solutions. PREFERRED EXPERIENCE: GPU Kernel Development & Optimization: Experienced in designing and optimizing GPU kernels for deep learning on AMD GPUs using HIP, CUDA, and assembly (ASM). Strong knowledge of AMD architectures (GCN, RDNA) and low-level programming to maximize performance for AI operations, leveraging tools like Composable Kernel (CK), CUTLASS, and More ❯
industry experience. Expertise in translating complex machine learning algorithms into scalable, production-quality code, with proficiency in Python and a strong understanding of optimization techniques (experience with Cython and CUDA is a plus). Experience in developing Large Language Models (LLMs) is advantageous. In-depth understanding of computer architecture and its implications on AI/ML performance. Comprehensive knowledge More ❯
London, England, United Kingdom Hybrid / WFH Options
Treecode
above in Machine Learning, Computer Science, Engineering, or a related technical discipline or equivalent experience Desirable Strong software engineering experience in Python and other relevant languages (e.g. C++ and CUDA) Direct experience working in at least one of computer vision, robotics, simulation, graphics, or large language models. MS or above in Machine Learning, Computer Science, Engineering, or a related More ❯
London, England, United Kingdom Hybrid / WFH Options
Wayve
BSc or above in Machine Learning, Computer Science, Engineering, or a related technical discipline or equivalent experience Strong software engineering experience in Python and other relevant languages (e.g. C++ and CUDA) Direct experience working in at least one of computer vision, robotics, simulation, graphics, or large language models. MS or above in Machine Learning, Computer Science, Engineering, or a related More ❯
London, England, United Kingdom Hybrid / WFH Options
InstaDeep Ltd
optimise state-of-the-art algorithms and architectures, ensuring compute efficiency and performance. Low-Level Mastery: Write high-quality Python, C/C++, XLA, Pallas, Triton, and/or CUDA code to achieve performance breakthroughs. Required Skills Understanding of Linux systems, performance analysis tools, and hardware optimisation techniques Experience with distributed training frameworks (Ray, Dask, PyTorch Lightning, etc.) Expertise … with machine learning frameworks (JAX, TensorFlow, PyTorch, etc.) Passion for profiling, identifying bottlenecks, and delivering efficient solutions. Highly Desirable Track record of successfully scaling ML models. Experience writing custom CUDA kernels or XLA operations. Understanding of GPU/TPU architectures and their implications for efficient ML systems. Fundamentals of modern Deep Learning Actively following ML trends and a desire … to push boundaries. Example Projects: Profile algorithm traces, identifying opportunities for custom XLA operations and CUDA kernel development. Implement and apply SOTA architectures (Mamba, Griffin, Hyena) to research and applied projects. Adapt algorithms for large-scale distributed architectures across HPC clusters. Employ memory-efficient techniques within models for increased parameter counts and longer context lengths. What We Offer: Real More ❯