honest approach to problem-solving, and ability to collaborate with peers, stakeholders and management • Industry experience with machine learning teams • Working knowledge of common ML frameworks such as PyTorch, ONNX, DeepSpeed, etc. • Prior experience with cloud-native technologies like Kubernetes, Argo Workflows, Buildpacks, etc. • Experience with cloud providers such as AWS, GCP or Azure • A track record of collaboration with
experience communicating methodological choices and model results. • Demonstrated experience with verification and validation test benches. • Demonstrated experience with Explainable AI (XAI) techniques. • Demonstrated experience with Open Neural Network Exchange (ONNX).
in a multi-team project; 3 years of experience with deep learning; 3 years of experience in deep learning programming languages, frameworks, and tooling, including PyTorch, HuggingFace Transformers, and ONNX; 3 years of experience with mixed-precision computing in deep learning frameworks; 2 years of experience with one or more distributed training approaches such as DDP, FSDP, or DeepSpeed
implementation. Extensive experience with common machine learning Python frameworks such as TensorFlow and PyTorch, Python libraries such as pandas, and computer vision libraries such as OpenCV. Experience with ONNX and TensorRT. Very comfortable working in a Linux environment. Familiarity with software development tools and agile development practices. 6 years of experience in developing, optimizing, and testing deep learning in computer vision
Demonstrated academic or professional experience communicating methodological choices and model results. • Demonstrated experience with verification and validation test benches. • Demonstrated experience with Explainable AI (XAI) techniques. • Demonstrated experience with ONNX (Open Neural Network Exchange). Salary Range: $150,000-$200,000. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national
Enhance system reliability and observability, and manage system outages • Research and implement optimizations for LLM inference. Qualifications: • Experience with ML systems and deep learning frameworks such as PyTorch, TensorFlow, ONNX • Knowledge of LLM architectures and inference optimization techniques (e.g., batching, quantization) • Experience deploying scalable, reliable, real-time model serving systems • (Optional) GPU architecture understanding or CUDA programming experience • The compensation
trends in ML models, software stacks, and hardware architectures. PREFERRED EXPERIENCE: Proficiency in Python and C/C++ programming. Deep understanding of AI/ML algorithms, frameworks (e.g., PyTorch, ONNX), and model representations. Experience in analytical modeling of ML operators regarding compute and data movement. Background in optimization libraries and solvers like PuLP, CBC, Gurobi is advantageous. Effective communication and
a related discipline. Strong expertise in PyTorch and C++ programming. Experience with ML workload analysis, compiler development, and quantization techniques. Familiarity with deep learning frameworks such as TensorFlow or ONNX is a plus. Proven track record of solving complex performance and efficiency challenges in hardware-aware ML solutions. Ability to work collaboratively with strategic customers and deliver impactful results. Excellent
Version Control Optimization Strategic Thinking & Problem Solving Desirable: • Excellent cross-cultural communication and leadership skills for distributed teams • Ability to manage up effectively with senior leadership • Experience with ML.NET, ONNX Runtime, Semantic Kernel, and/or RavenDB AI capabilities • Exposure to managing geographically dispersed teams across multiple time zones • Prior success in leading organizational transformation initiatives. Key duties: Build a
define, prototype, and ship new AI-powered features including text-to-speech, image generation, and enhanced tool calling capabilities • Implement and optimize model serving infrastructure using frameworks like vLLM, ONNX Runtime, and Nvidia Triton to achieve production-scale performance requirements • Collaborate with DevOps teams on MLOps infrastructure including model monitoring, load testing, caching optimization, and automated CI/CD pipelines … engineering background with production experience • Extensive experience with PyTorch or other modern ML frameworks • Experience training custom models from scratch • Experience with model optimization and inference frameworks (e.g., vLLM, ONNX Runtime, Nvidia Triton) • Familiarity with MLOps practices & Kubernetes and ability to collaborate with DevOps teams on model monitoring, load testing, and CI/CD pipelines • Experience shipping ML-powered features