Remote CUDA Kernel Optimizer - ML Engineer - AI Trainer ($120-$250 per hour)
- Hiring Organisation: Mercor
- Location: Idaho, United States
- Employment Type: Permanent
- Salary: USD 250 Hourly
Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while … maintaining correctness and reproducibility.
Key Responsibilities
- Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.
- Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.
- Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.
- Report performance metrics, analyze speedups, and propose ...