CUDA, including experience with kernel development, compiler optimizations, and performance tuning. In-depth knowledge of GPU architecture, including memory hierarchies, thread blocks, warps, and scheduling. Experience with compiler development, LLVM/MLIR, or domain-specific language (DSL) optimizations. Familiarity with tensor operations and matrix multiplications is a plus. Proven experience optimizing numerical algorithms for high-performance computing environments. Familiarity with …
like Ghidra, IDA Pro, or similar. Strong understanding of embedded toolchains (compilers, linkers, debuggers). Familiarity with SoC bring-up, device trees, and system-level debugging. Knowledge or experience with LLVM or low-level compiler internals is advantageous. What next? If you're an Embedded Software Engineer excited by reverse engineering, AI-assisted tooling, and building next-gen infrastructure for …
City of London, London, United Kingdom Hybrid / WFH Options
IC Resources
like Ghidra, IDA Pro, or similar. Strong understanding of embedded toolchains (compilers, linkers, debuggers). Familiarity with SoC bring-up, device trees, and system-level debugging. Knowledge or experience with LLVM or low-level compiler internals is advantageous. What next? If you're an Embedded Software Engineer excited by reverse engineering, AI-assisted tooling, and building next-gen infrastructure for …
London, South East England, United Kingdom Hybrid / WFH Options
IC Resources
like Ghidra, IDA Pro, or similar. Strong understanding of embedded toolchains (compilers, linkers, debuggers). Familiarity with SoC bring-up, device trees, and system-level debugging. Knowledge or experience with LLVM or low-level compiler internals is advantageous. What next? If you're an Embedded Software Engineer excited by reverse engineering, AI-assisted tooling, and building next-gen infrastructure for …
London (City of London), South East England, United Kingdom Hybrid / WFH Options
IC Resources
like Ghidra, IDA Pro, or similar. Strong understanding of embedded toolchains (compilers, linkers, debuggers). Familiarity with SoC bring-up, device trees, and system-level debugging. Knowledge or experience with LLVM or low-level compiler internals is advantageous. What next? If you're an Embedded Software Engineer excited by reverse engineering, AI-assisted tooling, and building next-gen infrastructure for …
Slough, South East England, United Kingdom Hybrid / WFH Options
IC Resources
like Ghidra, IDA Pro, or similar. Strong understanding of embedded toolchains (compilers, linkers, debuggers). Familiarity with SoC bring-up, device trees, and system-level debugging. Knowledge or experience with LLVM or low-level compiler internals is advantageous. What next? If you're an Embedded Software Engineer excited by reverse engineering, AI-assisted tooling, and building next-gen infrastructure for …
City of London, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
like Ghidra, IDA Pro, or similar. Strong understanding of embedded toolchains (compilers, linkers, debuggers). Familiarity with SoC bring-up, device trees, and system-level debugging. Knowledge or experience with LLVM or low-level compiler internals is advantageous. What next? If you're an Embedded Software Engineer excited by reverse engineering, AI-assisted tooling, and building next-gen infrastructure for …
technologies. Experience in the development of, or contribution to, graphics drivers, demonstrating a strong understanding of shader compilation processes and low-level graphics API interactions. Familiarity with compiler technologies (particularly LLVM) and shader ecosystems, including high-level languages (e.g., HLSL, GLSL) and intermediate representations (e.g., SPIR-V), relevant to driver development or low-level API programming. About the job Google …
role. Experience in the development of, or contribution to, graphics drivers, demonstrating a strong understanding of shader compilation processes and low-level graphics API interactions. Familiarity with compiler technologies (particularly LLVM) and shader ecosystems, including high-level languages (e.g., HLSL, GLSL) and intermediate representations (e.g., SPIR-V), relevant to driver development or low-level API programming. About the job Google …
and grow in both directions. This position will require the candidate to work closely with researchers and engineers to enable and accelerate new research efforts for on-device AI. LLVM experience will be a plus. Role and Responsibilities: As an AI Software Engineer, you will: Develop features and functionality across the AI stack – from framework to applications for on … practices and tools such as Git, CI, Agile, package managers, etc. Excellent communication, teamwork, problem-solving skills, and a results-oriented attitude. Desirable Skills: Knowledge of computer vision fundamentals. LLVM compiler experience. Experience with commercial/production AI. Experience in Python/Java/Kotlin.
ISA definition and enhancements. Benchmark and optimize compiler performance for key workloads. Contribute to documentation and developer resources. Requirements: 5+ years of experience in compiler development. Strong knowledge of LLVM or similar compiler infrastructure. Experience with code generation for vector architectures. Understanding of graphics shader compilers and/or AI compiler stacks. Familiarity with RISC-V architecture and vector …
London (City of London), South East England, United Kingdom
microTECH Global LTD
ISA definition and enhancements. Benchmark and optimize compiler performance for key workloads. Contribute to documentation and developer resources. Requirements: 5+ years of experience in compiler development. Strong knowledge of LLVM or similar compiler infrastructure. Experience with code generation for vector architectures. Understanding of graphics shader compilers and/or AI compiler stacks. Familiarity with RISC-V architecture and vector …
bottlenecks across the stack, from model execution and scheduling to hardware-level constraints. Collaborate with compiler engineers to improve code generation, execution paths, and memory layouts using tools like LLVM or MLIR. Work with hardware teams to ensure the software stack fully leverages the capabilities of our OTPU architecture. Extend ML frameworks (e.g., PyTorch, ONNX, OpenXLA) to support performance … a focus on real-time or near real-time data processing. Strong programming skills in C++ and Python for performance-sensitive applications. Hands-on experience with ML compilers (e.g., LLVM, MLIR), and knowledge of runtime and scheduling optimizations. Practical knowledge of ML frameworks like PyTorch, ONNX, or OpenXLA, and how to optimize their execution. Experience scaling AI workloads across …
bottlenecks across the stack, from model execution and scheduling to hardware-level constraints. Collaborate with compiler engineers to improve code generation, execution paths, and memory layouts using tools like LLVM or MLIR. Work with hardware teams to ensure the software stack fully leverages the capabilities of our OTPU architecture. Extend ML frameworks (e.g. PyTorch, ONNX, OpenXLA) to better support … focus on real-time or near real-time data processing. Strong programming skills in C++ and Python, especially for performance-sensitive applications. Hands-on experience with ML compilers (e.g. LLVM, MLIR), and knowledge of runtime and scheduling optimisations. Practical knowledge of ML frameworks like PyTorch, ONNX, or OpenXLA, and how to optimise their execution. Experience scaling AI workloads across …
resolve performance bottlenecks across the stack, including model execution, scheduling, and hardware constraints. Collaborate with compiler engineers to improve code generation, execution paths, and memory layouts using tools like LLVM or MLIR. Work with hardware teams to ensure software fully leverages OTPU architecture capabilities. Extend ML frameworks (e.g., PyTorch, ONNX, OpenXLA) to support performance-critical inference paths. Lead design … of distributed systems, especially real-time or near real-time data processing. Strong programming skills in C++ and Python for performance-sensitive applications. Hands-on experience with ML compilers (LLVM, MLIR) and knowledge of runtime and scheduling optimizations. Practical knowledge of ML frameworks like PyTorch, ONNX, or OpenXLA, and how to optimize their execution. Experience scaling AI workloads across …
years in performance-critical systems (HPC, HFT, AI infrastructure). Deep understanding of distributed systems and real-time data processing. Strong C++ and Python skills. Experience with ML compilers like LLVM, MLIR. Knowledge of ML frameworks and optimization techniques. Experience scaling AI workloads on custom infrastructure. Debugging, profiling, performance tuning skills. Degree in CS, Engineering, Math, or related field. Details …
interested in working with a world leader in this space - apply below. Required Skills & Experience needed: Prior working experience with compiler technologies, be that frontend/backend LLVM or MLIR. Strong programming language skills with C and/or C++. Familiarity with a GPGPU API such as SYCL, CUDA or OpenCL. Open Source code commits and reviews … computer architecture specifications like compilers, debuggers, models. Knowledge of GPU architecture and optimization techniques for GPGPU code would be a plus but not essential. Keywords: Compiler/Compilation/LLVM/GCC/OpenSource/Linux/C/C++/Low level/Hardware/debuggers/Fortran/OpenCL/CUDA/MLIR/Machine Learning/…
Extensive experience in ML framework internals, compilers, low-level programming, and optimisation techniques. Extensive experience optimising TensorFlow, PyTorch or JAX deep learning models. Extensive experience with multiple toolchains like LLVM, OpenXLA/XLA, MLIR, TVM. Practical experience applying machine learning in high-performance computing contexts. Strong problem-solving skills and the ability to think critically and creatively. Experience in …