Staff Machine Learning Architect - Assembly Coding and Performance Engineer
Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
Arm Limited
Job Overview: High-performance ML workloads on Arm CPUs require the co-development of algorithms and highly optimized CPU kernels. In CT-ML (Central Technology, Machine Learning), rapid kernel prototyping is crucial for exploring algorithms and assessing trade-offs between model accuracy and performance. Successful prototypes drive future CPU … of a dedicated team within the CT-ML group focused on analyzing ML workloads and rapidly prototyping highly optimized CPU kernels to enhance model performance and accuracy.

Required Skills and Experience:
Strong interest in and passion for implementing high-performance kernel code in dynamic environments.
4+ years of experience … in implementing high-performance CPU kernels with vector and matrix extensions.
Experience measuring and understanding performance metrics.
Experience in creating efficient kernel development frameworks, including tools and testing methodologies.
Deep understanding of CPU architecture.

"Nice To Have" Skills and Experience:
Knowledge of ML models and algorithms is a …
Employment Type: Permanent
Salary: GBP Annual