custom in-house). Solid grasp of computer-architecture fundamentals: memory systems, interconnects, queuing theory, Amdahl/Gustafson analysis. Familiarity with machine-learning workloads and common frameworks (PyTorch, TensorFlow, JAX). Comfort reading RTL or schematics and discussing micro-architectural trade-offs with hardware designers. Excellent data-visualisation and communication skills: able to turn millions of simulation samples into one …
may be a good fit if you: Have 5+ years of industry-related experience; are proficient in Python and have experience with deep learning frameworks such as PyTorch or JAX; have a strong software engineering background and are interested in working closely with researchers and other engineers; enjoy pair programming (we love to pair!); care about code quality, testing, and …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Samsung Electronics Perú
/EE or related research experience in academia or industry. We will consider various levels of experience in relevant research areas. Key Skills: Experience with ML frameworks (PyTorch, TensorFlow, JAX) and efficient ML (incl. quantization, pruning, sparsification, distillation, etc.). Experience with deployment on embedded/mobile devices (such as smartphones, with mobile CPU, GPU, NPU). Experience with distributed and multi …
role demands deep expertise in C and C++ programming, ML framework internals, compiler construction, and optimisation techniques. Key Deliverables: Implement Runtime: Build a runtime that seamlessly integrates with PyTorch, JAX, and TensorFlow (PJRT) for both training and inference execution patterns. The runtime must support asynchronous execution and multiple devices. Implement Compiler: Build a compiler that is extensible to future optimisation … patterns across operation fusion, layout optimisation, tiling, and scheduling. Implement Debugger & Diagnostics: Support optional runtime assertions and compile-time dumps, TensorBoard timelines, and JAX I/O callbacks. Implement Functional Simulator: Build a functional simulator that mocks our kernel-space driver, allowing the software team to lower operations ahead of hardware teams. Skills & Experience: 5+ years of experience in software …
London (City of London), South East England, United Kingdom
Flux Computing
re searching for Senior ML Compiler Engineers to join the team building the ML backend (compiler, run-time, and debugger) for our next-generation OTPUs that connect PyTorch, TensorFlow, JAX, and MXNet down to our low-level kernel drivers. Your mission will be to create seamless support for a broad ecosystem of large AI models, and ensure we are pushing the … roadmap that unlock key high-impact technical and business milestones that drive the success of Flux. Architect & Build: Design and implement our compiler, runtime, and debugger for PyTorch, TensorFlow, JAX, and MXNet on custom hardware. Optimise Performance: Apply advanced techniques (layout, fusion, scheduling, tiling) to eliminate bottlenecks and maximise throughput. Mentor & Encourage Standards: Lead code reviews, coach peers, and uphold … engineering with a focus on C/C++ programming. Deep expertise in ML framework internals, compilers, low-level programming, and optimisation techniques. Deep expertise in optimising TensorFlow, PyTorch or JAX deep learning models. Deep expertise with multiple toolchains like LLVM, OpenXLA/XLA, MLIR, TVM. Practical experience applying machine learning in high-performance computing contexts. Strong problem-solving skills and …
a compiler extensible to future optimisation patterns such as op fusion, layout optimisation, tiling, and scheduling. Debugger & Diagnostics: Implement optional runtime assertions, compile-time dump mechanisms, TensorBoard timelines, and JAX I/O callback support. Functional Simulator: Develop a simulator that mimics our kernel-space driver, enabling the software team to lower operations ahead of hardware availability. Required Skills & Experience … 5+ years of professional experience in C/C++ software engineering. Strong background in compilers, runtime systems, and low-level optimisations. Deep familiarity with ML frameworks (e.g., PyTorch, JAX, TensorFlow) and their execution models. Experience with high-performance computing or hardware-software co-design. Strong problem-solving skills with a creative and pragmatic mindset. Comfortable operating in fast-paced, ambiguous …
London (City of London), South East England, United Kingdom
Flux Computing
searching for Staff Compiler Engineers to architect and build the ML backend (compiler, run-time, and debugger) for our next-generation OTPUs. You will own integration with PyTorch, TensorFlow, JAX, and MXNet down to our low-level kernel drivers. Your mission will be to create seamless support for a broad ecosystem of large AI models, and ensure we are pushing … requirements with the OTPU design; ensure software and hardware are designed together to deliver maximum performance. Architect & Build: Design and implement our compiler, runtime, and debugger for PyTorch, TensorFlow, JAX, and MXNet on custom hardware. Optimize Performance: Apply advanced techniques (layout, fusion, scheduling, tiling) to eliminate bottlenecks and maximize throughput. Mentor & Define Standards: Lead code reviews, coach peers, and define … software engineering with a focus on C/C++ programming. Extensive experience in ML framework internals, compilers, low-level programming, and optimisation techniques. Extensive experience optimising TensorFlow, PyTorch or JAX deep learning models. Extensive experience with multiple toolchains like LLVM, OpenXLA/XLA, MLIR, TVM. Practical experience applying machine learning in high-performance computing contexts. Strong problem-solving skills and …
language modeling architectures (e.g. transformers, SSMs). Solid development skills in Python and/or C++. Familiarity with ML libraries/frameworks such as PyTorch (preferred), TensorFlow, and/or JAX. Intellectual curiosity, versatility, and originality combined with a pragmatic outlook. Ability to reason through quantitative problems and communicate effectively with trading researchers. Reliable and predictable availability. Bonus Points: Experience with …
London (City of London), South East England, United Kingdom
Flux Computing
custom in-house). Solid grasp of computer-architecture fundamentals: memory systems, interconnects, queuing theory, Amdahl/Gustafson analysis. Familiarity with machine-learning workloads and common frameworks (PyTorch, TensorFlow, JAX). Comfort reading RTL or schematics and discussing micro-architectural trade-offs with hardware designers. Excellent data-visualisation and communication skills: able to turn millions of simulation samples into one …