Machine Learning Engineer, Amazon General Intelligence (AGI) Job ID: Services LLC - A57 Our Machine Learning training infrastructure (ML Infra) team is responsible for designing, implementing, and optimizing large-scale computing infrastructure that powers our cutting-edge AI and machine learning initiatives. We leverage advanced hardware, innovative software architectures, and distributed computing techniques to enable breakthrough research and product … top talent and recruit them to the company. - Actively mentor senior and Principal engineers, scale yourself by developing and institutionalizing best practices in AI/ML infrastructure and distributed computing across the organization. A day in the life 8+ years of professional software development experience in distributed systems with emphasis on ML infrastructure - 8+ years of current programming experience … building ML infrastructure using languages such as Python, C++ or Rust - Hands-on experience with parallel computing platforms such as CUDA, OpenMP, etc. - Deep understanding of AI frameworks such as PyTorch, TensorFlow, and JAX, and their demands on underlying compute infrastructure, memory bandwidth, network interconnect, and storage as scale goes up - Knowledge of emerging AI hardware accelerators and …
development, with a focus on building performant, scalable systems. Deep understanding of core Python, including its strengths in data manipulation, asynchronous programming, and performance optimization. Experience with distributed systems, parallel computing, and high-performance processing of large datasets. Strong experience in data pipelines, working with tools such as Pandas, NumPy, and SQL/NoSQL databases. Proven experience working …
programming, with a strong grasp of modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying and resolving system bottlenecks. Experience in system-level programming …
City of London, London, United Kingdom Hybrid / WFH Options
Annapurna
programming, with a strong grasp of modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying and resolving system bottlenecks. Experience in system-level programming …
Live operation of such systems, including monitoring, proactive detection of potential problems, and intervention. Stay current on state-of-the-art technologies and tools, including technical libraries, computing environments, and academic research. Collaborate with the PM and the trading group in a transparent environment, engaging with the whole investment process. Preferred Technical Skills: Expert in Python and …/or KDB/Q. Proficient in modern data science tool stacks (Jupyter, pandas, numpy, sklearn) with machine learning experience. Good understanding of using Slurm or similar parallel computing tools. Bachelor's or Master's degree in Computer Science, Mathematics, Statistics, or a related STEM field from a top-ranked university. Proficient in quantitative analysis, mathematical modelling, statistics, regression, and …
a huge plus. Performance Engineering Excellence: Proven ability to diagnose, profile, and optimize complex systems using advanced performance analysis tools and methodologies. Demonstrated experience in tuning multi-threaded and parallel computing environments, managing concurrency, and applying lock-free designs for efficient resource utilization. Familiarity with performance engineering technologies and low-cost always-on profiling, metrics, and observability. Extensive …
programming frameworks like CUDA, OpenCL, and TensorFlow. Understanding of machine learning algorithms, including model training and inference, and how to optimize these for GPU-based computation. Strong knowledge of parallel computing, vectorization, and multi-core systems for high-performance computing (HPC). Experience with profiling tools (e.g., NVIDIA Nsight, gdb, perf) and performance tuning in a GPU … problem-solving skills and a keen interest in optimizing systems for ML workloads. A passion for machine learning, AI, and innovative technology. Nice to Have: Experience with high-performance computing (HPC) and large-scale distributed systems. Knowledge of AI/ML libraries such as cuDNN, TensorRT, or other GPU-accelerated libraries. Familiarity with low-level debugging tools and profiling …
real time to ensure seamless trading operations. Requirements: At least 2 years of experience developing algorithmic trading systems. Strong programming skills in Python and KDB/Q. Familiarity with parallel computing frameworks such as multiprocessing or multithreading. Solid understanding of modern software development practices, including version control, unit testing, and debugging. Strong foundation in quantitative analysis, statistics, mathematical …