126 to 150 of 232 Permanent CUDA Jobs

Remote CUDA Kernel Optimizer - ML Engineer - AI Trainer ($120-$250 per hour)

Hiring Organisation
Mercor
Location
Great Falls, Montana, United States
Employment Type
Permanent
Salary
USD 250 Hourly
Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while … maintaining correctness and reproducibility.
2) Key Responsibilities
- Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.
- Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.
- Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.
- Report performance metrics, analyze speedups, and propose ...

Remote CUDA Kernel Optimizer - ML Engineer - AI Trainer ($120-$250 per hour)

Hiring Organisation
Mercor
Location
Passaic, New Jersey, United States
Employment Type
Permanent
Salary
USD 250 Hourly
Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while … maintaining correctness and reproducibility.
2) Key Responsibilities
- Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.
- Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.
- Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.
- Report performance metrics, analyze speedups, and propose ...

Remote CUDA Kernel Optimizer - ML Engineer - AI Trainer ($120-$250 per hour)

Hiring Organisation
Mercor
Location
Linden, New Jersey, United States
Employment Type
Permanent
Salary
USD 250 Hourly
Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while … maintaining correctness and reproducibility.
2) Key Responsibilities
- Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.
- Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.
- Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.
- Report performance metrics, analyze speedups, and propose ...

Remote CUDA Kernel Optimizer - ML Engineer - AI Trainer ($120-$250 per hour)

Hiring Organisation
Mercor
Location
White Plains, New York, United States
Employment Type
Permanent
Salary
USD 250 Hourly
Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while … maintaining correctness and reproducibility.
2) Key Responsibilities
- Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.
- Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.
- Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.
- Report performance metrics, analyze speedups, and propose ...

Remote CUDA Kernel Optimizer - ML Engineer - AI Trainer ($120-$250 per hour)

Hiring Organisation
Mercor
Location
Union City, New Jersey, United States
Employment Type
Permanent
Salary
USD 250 Hourly
Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while … maintaining correctness and reproducibility.
2) Key Responsibilities
- Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.
- Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.
- Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.
- Report performance metrics, analyze speedups, and propose ...

Remote CUDA Kernel Optimizer - ML Engineer - AI Trainer ($120-$250 per hour)

Hiring Organisation
Mercor
Location
Las Cruces, New Mexico, United States
Employment Type
Permanent
Salary
USD 250 Hourly
Role Overview
Mercor is engaging advanced CUDA experts who specialize in GPU kernel optimization, performance profiling, and numerical efficiency. These professionals possess a deep mental model of how modern GPU architectures execute deep learning workloads. They are comfortable translating algorithmic concepts into finely tuned kernels that maximize throughput while … maintaining correctness and reproducibility.
2) Key Responsibilities
- Develop, tune, and benchmark CUDA kernels for tensor and operator workloads.
- Optimize for occupancy, memory coalescing, instruction-level parallelism, and warp scheduling.
- Profile and diagnose performance bottlenecks using Nsight Systems, Nsight Compute, and comparable tools.
- Report performance metrics, analyze speedups, and propose ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Houston, Texas, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Suffolk, Virginia, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Erie, Pennsylvania, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Florissant, Missouri, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Brentwood, Tennessee, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Yonkers, New York, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
New Rochelle, New York, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Euclid, Ohio, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Blaine, Minnesota, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Hoover, Alabama, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Champaign, Illinois, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Lawrence, Indiana, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
New Bedford, Massachusetts, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Avondale, Arizona, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
State College, Pennsylvania, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Annapolis, Maryland, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Bremerton, Washington, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Ames, Iowa, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...

Remote PyTorch Operator - ML Engineer - AI Trainer ($100-$160 per hour)

Hiring Organisation
Mercor
Location
Allentown, Pennsylvania, United States
Employment Type
Permanent
Salary
USD 160 Hourly
validate Python bindings with correct gradient propagation and test coverage.
- Create "golden" reference implementations in eager mode for correctness validation.
- Collaborate asynchronously with CUDA or systems engineers who handle low-level kernel optimization.
- Profile, benchmark, and report performance trends at the operator and graph level.
- Document assumptions, APIs …
Opportunity
- Ideal for contractors who enjoy building clean, high-performance abstractions in deep learning frameworks.
- Work is asynchronous, flexible, and outcome-oriented.
- Collaborate with CUDA optimization specialists to integrate and validate kernels.
- Projects may involve primitives used in state-of-the-art AI models and benchmarks.
5) Compensation & Contract ...