environments) Hands-on with Docker, Kubernetes, and Terraform Strong scripting skills in Python or Bash Familiar with ML lifecycle tools, model monitoring, and versioning Exposure to tools like KServe, Ray Serve, Triton, or vLLM is a big plus Bonus Points Experience with observability frameworks like Prometheus or OpenTelemetry Knowledge of ML libraries: TensorFlow, PyTorch, HuggingFace Exposure to Azure or GCP More ❯
paced, collaborative, and dynamic environment. Nice to haves: Prior experience with PCB design, EDA tools, or related optimization problems. Hands-on experience in high-performance computing environments (e.g., Kubernetes, Ray, Dask). Contributions to open-source projects, publications, or top placements in ML competitions (e.g., Kaggle). Expertise in related fields such as Computer Vision, Representation Learning, or Simulation Environments. More ❯
data modeling, architecture, and processing unstructured data. Experience with processing 3D geometric data. Experience with large-scale, data-intensive systems in production. Knowledge of distributed computing frameworks (Spark, Dask, Ray). Experience with cloud platforms (AWS, Azure, GCP). Proficiency with Docker, Linux, and bash. Ability to document code, architectures, and experiments. Preferred Qualifications Experience with databases and data warehousing More ❯
CentOS/Ubuntu), kernel tuning, and HPC stack deployment. Experience with containerized GPU workloads using Docker, Kubernetes, and NVIDIA GPU Operator. Familiarity with distributed compute frameworks (e.g., SLURM, Kubernetes, Ray). Strong scripting skills: Bash, Python, or similar. Proven ability to plan and execute large-scale system upgrades and migrations. Candidate must, at a minimum, meet DoD 8570.11- IAT Level More ❯
user base. As a team, we are a collaborative, cross-functional group with backgrounds in information retrieval, natural language processing, and distributed systems. We work with Go microservices, Python, Ray Serve, Kubernetes/KubeRay, and work on AWS, GCP & Azure. We provide thought leadership across a variety of mediums including open code repositories, publishing blogs, and speaking at conferences. We … Bring 5+ years working in an MLOps or related ML Engineering role Production experience self-hosting & operating LLMs at scale for generative tasks via an inference framework such as Ray or KServe (or similar) Production experience with running and tuning specialized hardware for Generative AI workloads, especially GPUs via CUDA Measured and articulate written and spoken communication skills. You work More ❯
functional engineering teams. Passion for open-source and decentralized infrastructure. Excellent communication and executive presence. Preferred Tech Stack Languages: Go, Rust, Python, Solidity AI Stack (plus): PyTorch, Hugging Face, Ray, ONNX What We Offer Competitive salary + equity/token package Flexible, remote-first work environment High-impact leadership role in a fast-scaling frontier tech company Opportunity to shape More ❯
Washington, Washington DC, United States Hybrid / WFH Options
RAND Corporation
on security topics Familiarity with the AI/ML hardware stack (e.g. GPUs, TPUs, data center design) Familiarity with the AI/ML software stack (e.g. CUDA, PyTorch, TensorFlow, Ray) Experience working on AI research, ML model training, or model deployment Experience with securing AI systems Education Requirements RAND is hiring a Research Lead at either the specialist or expert More ❯
Washington, Washington DC, United States Hybrid / WFH Options
RAND Corporation
on security topics Familiarity with the AI/ML hardware stack (e.g., GPUs, TPUs, data center design) Familiarity with the AI/ML software stack (e.g., CUDA, PyTorch, TensorFlow, Ray) Experience working on AI research, ML model training, or model deployment Experience with securing AI systems Education Requirements RAND is hiring multiple Visiting AI Security Residents at associate, specialist, and More ❯
Annapolis Junction, Maryland, United States Hybrid / WFH Options
Lockheed Martin
using Kotlin and updating existing REST APIs. Responsibilities include being available to address any issues that may occur in the production environment. Desired Skills • Experience with RabbitMQ • Experience with Ray • Experience with Spring • Experience with FastAPI • Experience with S3 • Experience with AWS • Experience with PostgreSQL (or similar SQL implementation) • Experience with Elasticsearch Clearance Level: TS/SCI w/Poly More ❯
Our Mission Our mission is to restore cell health and resilience through cell rejuvenation to reverse disease, injury, and the disabilities that can occur throughout life. For more information, see our website at Our Value Our Single Altos Value: Everyone More ❯
Chantilly, Virginia, United States Hybrid / WFH Options
Noblis
beyond traditional code writing and debugging. Understanding of how to craft prompts for discrete tasks such as data labeling and processing. Experience with asynchronous Python development using frameworks like Ray and FastAPI. Experience developing and deploying AI/ML models as part of production systems, implementing software development best practices for scalability and reliability. Experience developing and applying advanced machine … learning methods including clustering, regression, optimization, recommender engines, and artificial neural networks. Deploying models into multi-node Ray clusters utilizing various GPU resources. Experience with developing and deploying resources to cloud environments. Experience working in Linux and Windows environments and managing resources for efficiency. ACTIVE Top Secret with SCI and Polygraph Required Technologies: Python Ray and FastAPI Autogen and other More ❯
accelerating ML research and deployment in creative space. Bonus Skills (Nice-to-Have) Experience with ML pipelines involving video, image, or 3D data. Familiarity with distributed compute frameworks (e.g., Ray) or orchestration tools (e.g., Flyte). Familiarity with game engines (Unreal or Unity) Knowledge of vector databases and similarity search (e.g., LanceDB). Prior work in AI/ML research More ❯
high-quality data flows through the pipeline. Collaborate with research to define data quality benchmarks . Optimize end-to-end performance across distributed data processing frameworks (e.g., Apache Spark, Ray, Airflow). Work with infrastructure teams to scale pipelines across thousands of GPUs . Work directly with the leadership on the data team roadmaps. Manage the team of data engineers. More ❯
Ability to convert customer requirements or business challenges into well-defined machine learning solutions We are using many technologies day to day such as various AWS services, GCP, Kubernetes, Ray Serve, Kubeflow, and ReTool. Any experience in these areas would be a bonus Sprout.ai Values Hungry for Growth - Unleash your inner Sprout: Sprouts embrace growth, forget comfort zones, and help More ❯
monitoring Experience with Git source control Desired Skills: Experience debugging GPU-enabled applications Familiarity with LLM orchestration (e.g., OpenAI API) Experience with distributed processing frameworks like Spark, Dask, or Ray for ETL workflows Proficiency in SQL, Elasticsearch, and vector databases Knowledge of HTMX or Hyper-script Experience with multi-node, multi-GPU AI model training (HW/SW) Knowledge of … AI inferencing frameworks: Nvidia NIM/TRITON, vLLM, Ray Familiarity with the Atlassian suite: Confluence, Jira More ❯
thinking, and influence across teams A bias for action, accountability, and leading by example Preferred Skills (Nice to Have) Experience with AI/ML infrastructure tooling (e.g. vLLM, KServe, Ray, Triton) Familiarity with Python, especially ML libraries and model interfaces Exposure to GPU orchestration frameworks or building services for model training/inference Understanding of multi-tenant systems, isolation strategies More ❯
CI Monitoring and telemetry tools: Prometheus, Grafana Version control with Git Preferred Qualifications: Experience debugging GPU-enabled applications Familiarity with OpenAPI, HTMX, or Hyperscript Experience with Spark, Dask, or Ray for distributed data workflows Knowledge of AI inference tools (e.g., Nvidia NIM, Triton, vLLM) Understanding of multi-GPU model training and hardware integration Experience with SQL, Elasticsearch, or vector databases More ❯
experiments Experience with ML model monitoring systems Experience with ML training and data pipelines and working with distributed systems Proficiency with modern deep learning libraries and frameworks (PyTorch, Lightning, Ray) Preferred Qualifications Experience owning a product from development through monitoring and incident response Knowledge of the design, manufacturing, AEC, or media & entertainment industries Experience with Autodesk or similar products (CAD More ❯
learning tasks such as natural language processing, image recognition, semantic segmentation, reinforcement learning, approaches such as Bayesian, deep convolutional and graph neural network methods, and tools such as PyTorch, Ray, TensorFlow/board, and MLflow Interacting with decision-makers and customers to translate mission needs into an end-to-end analytical solution Ability to apply novel, innovative and interdisciplinary approaches More ❯
learning tasks such as natural language processing, image recognition, semantic segmentation, reinforcement learning, approaches such as Bayesian, deep convolutional and graph neural network methods, and tools such as PyTorch, Ray, TensorFlow/board, and MLflow Interacting with decision-makers and customers to translate mission needs into an end-to-end solution Ability to apply novel, innovative and interdisciplinary approaches to More ❯
Desired Skills Experience working with and debugging GPU-enabled applications Familiar with LLM orchestration such as the OpenAI API Experience with distributed processing technologies such as Spark, Dask, or Ray for data processing ETL workflows Experience with SQL, Elasticsearch, and Vector databases Experience with HTMX or Hyper-script Experience with HW/SW aspects of multi-node, multi-GPU AI … model training Knowledge of AI inferencing solutions such as Nvidia NIM/TRITON, vLLM, and direct deployment (e.g. Ray) Experience with the Atlassian suite of tools including Confluence and Jira Qualifications An active TS/SCI with polygraph is required. Offering the very best compensation packages and the flexibility to let our employees decide what's most important to them. More ❯