and C++ are nice to have. 3+ years designing and building data-intensive solutions with distributed computing. 2+ years with industry-recognized ML frameworks (scikit-learn, PyTorch, TensorFlow, Spark, Ray). 1+ year productionizing, monitoring, and maintaining models. Bachelor's degree or equivalent work experience. Preferred Qualifications: 1+ years building, scaling, and optimizing ML systems. 1+ years in data gathering More ❯
problem-solving skills and ability to work independently in a fast-paced environment. Desirable: Experience with NLP, computer vision, or time-series forecasting. Familiarity with distributed computing frameworks (Spark, Ray). Experience with MLOps and model governance practices. Previous contract experience in a similar ML engineering role. Contract Details Duration: 6–12 months (extension possible) Location: London (Hybrid working model More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Experis
problem-solving skills and ability to work independently in a fast-paced environment. Desirable: Experience with NLP, computer vision, or time-series forecasting. Familiarity with distributed computing frameworks (Spark, Ray). Experience with MLOps and model governance practices. Previous contract experience in a similar ML engineering role. Contract Details Duration: 6–12 months (extension possible) Location: London (Hybrid working model More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Experis
problem-solving skills and ability to work independently in a fast-paced environment. Desirable: Experience with NLP, computer vision, or time-series forecasting. Familiarity with distributed computing frameworks (Spark, Ray). Experience with MLOps and model governance practices. Previous contract experience in a similar ML engineering role. Contract Details Duration: 6–12 months (extension possible) Location: London (Hybrid working model More ❯
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Experis
problem-solving skills and ability to work independently in a fast-paced environment. Desirable: Experience with NLP, computer vision, or time-series forecasting. Familiarity with distributed computing frameworks (Spark, Ray). Experience with MLOps and model governance practices. Previous contract experience in a similar ML engineering role. Contract Details Duration: 6–12 months (extension possible) Location: London (Hybrid working model More ❯
Slough, South East England, United Kingdom Hybrid / WFH Options
Experis
problem-solving skills and ability to work independently in a fast-paced environment. Desirable: Experience with NLP, computer vision, or time-series forecasting. Familiarity with distributed computing frameworks (Spark, Ray). Experience with MLOps and model governance practices. Previous contract experience in a similar ML engineering role. Contract Details Duration: 6–12 months (extension possible) Location: London (Hybrid working model More ❯
CentOS/Ubuntu), kernel tuning, and HPC stack deployment. • Experience with containerized GPU workloads using Docker, Kubernetes, and NVIDIA GPU Operator. • Familiarity with distributed compute frameworks (e.g., SLURM, Kubernetes, Ray). • Strong scripting skills: Bash, Python, or similar. • Proven ability to plan and execute large-scale system upgrades and migrations. • Candidate must, at a minimum, meet DoD 8570.11- IAT Level More ❯
in datacenter and edge settings. Familiarity with ethical AI, bias mitigation, and regulatory compliance in model development. Strong programming skills in Python and experience with distributed systems (e.g., Kubernetes, Ray). Important information for candidates Apply only through our official channels. We do not use third-party platforms or agencies for recruitment unless clearly stated. All open roles are listed More ❯
Pallas, Triton, and/or CUDA code to achieve performance breakthroughs. Required Skills Understanding of Linux systems, performance analysis tools, and hardware optimisation techniques Experience with distributed training frameworks (Ray, Dask, PyTorch Lightning, etc.) Expertise with Python and/or C/C++ Development with machine learning frameworks (JAX, TensorFlow, PyTorch, etc.) Passion for profiling, identifying bottlenecks, and delivering efficient More ❯
Washington, Washington DC, United States Hybrid / WFH Options
RAND Corporation
on security topics Familiarity with the AI/ML hardware stack (e.g. GPUs, TPUs, data center design) Familiarity with the AI/ML software stack (e.g. CUDA, PyTorch, TensorFlow, Ray) Experience working on AI research, ML model training, or model deployment Experience with securing AI systems Education Requirements RAND is hiring a Research Lead at either the specialist or expert More ❯
Washington, Washington DC, United States Hybrid / WFH Options
RAND Corporation
on security topics Familiarity with the AI/ML hardware stack (e.g., GPUs, TPUs, data center design) Familiarity with the AI/ML software stack (e.g., CUDA, PyTorch, TensorFlow, Ray) Experience working on AI research, ML model training, or model deployment Experience with securing AI systems Education Requirements RAND is hiring multiple Visiting AI Security Residents at associate, specialist, and More ❯
CentOS/Ubuntu), kernel tuning, and HPC stack deployment. Experience with containerized GPU workloads using Docker, Kubernetes, and NVIDIA GPU Operator. Familiarity with distributed compute frameworks (e.g., SLURM, Kubernetes, Ray). Strong scripting skills: Bash, Python, or similar. Proven ability to plan and execute large-scale system upgrades and migrations. Candidate must, at a minimum, meet DoD 8570.11- IAT Level More ❯
to take ownership of tasks Tech Stack Core: Python, FastAPI, asyncio, Airflow, Luigi, PySpark, Docker, LangGraph Data Stores: Vector Databases, DynamoDB, AWS S3, AWS RDS Cloud & MLOps: AWS, Databricks, Ray Unlimited vacation time - we strongly encourage all of our employees to take at least 3 weeks per year Fully remote team - choose where you live Work from home stipend! We want More ❯
in Deep Learning, including training, evaluation, and optimisation. Strong grounding in mathematics, statistics, and data analysis. Experience working in Agile environments. Familiarity with technologies such as AWS, GCP, Kubernetes, Ray Serve, and Kubeflow is desirable. ---------------------------------------- Professional Values Growth: Demonstrates curiosity, adaptability, and continuous learning. Accountability: Takes ownership and delivers to a high standard. Innovation: Embraces experimentation and emerging technologies to More ❯
learning tasks such as natural language processing, image recognition, semantic segmentation, reinforcement learning, approaches such as Bayesian, deep convolutional and graph neural network methods, and tools such as PyTorch, Ray, TensorFlow/board, and MLflow Interacting with decision-makers and customers to translate mission needs into an end-to-end analytical solution Ability to apply novel, innovative and interdisciplinary approaches More ❯
Desired Skills Experience working with and debugging GPU-enabled applications Familiar with LLM orchestration such as Open API API Experience with distributed processing technologies such as Spark, Dask, or Ray for data processing ETL workflows Experience with SQL, Elasticsearch, and Vector databases Experience with HTMX or Hyper-script Experience with HW/SW aspects of multi-node, multi-GPU AI … model training Knowledge of AI inferencing solutions such as Nvidia NIM/TRITON, vLLM, and direct deployment (e.g. Ray) Experience with the Atlassian suite of tools including Confluence and Jira SYSTOLIC is hosting two information sessions on October 8, 2025 about the Intel Community contracting industry and how our company fits into it. Join us and learn about our transparent More ❯
years of experience as a software engineer Nice If You Have: Experience working with and debugging GPU-enabled applications Experience with distributed processing technologies such as Spark, Dask, or Ray for data processing ETL workflows Experience with SQL, Elasticsearch, and Vector databases Experience with HTMX or Hyper-script Experience with HW/SW aspects of multi-node, multi-GPU AI … model training Knowledge of AI inferencing solutions such as Nvidia NIM/TRITON, vLLM, and direct deployment (e.g. Ray) Knowledge of LLM orchestration such as Open API API Experience with the Atlassian suite of tools including Confluence and Jira Clearance: Applicants selected will be subject to a security investigation and may need to meet eligibility requirements for access to classified More ❯
Bethesda, Maryland, United States Hybrid / WFH Options
Base-2 Solutions, LLC
Job Description Base-2 Solutions is seeking a High Compute Engineer who will lead the design, optimization, and integration of GPU-centric high-performance compute environments. Manage existing NVIDIA A100 and DGX-1 systems while designing scalable architectures to incorporate More ❯
Chantilly, Virginia, United States Hybrid / WFH Options
Noblis
beyond traditional code writing and debugging. Understanding of how to craft prompts for discrete tasks such as data labeling and processing. Experience with asynchronous Python development using frameworks like Ray and FastAPI. Experience developing and deploying AI/ML models as part of production systems, implementing software development best practices for scalability and reliability. Experience developing and applying advanced machine … learning methods including clustering, regression, optimization, recommender engines, and artificial neural networks. Deploying models into multi-node Ray clusters utilizing various GPU resources. Experience with developing and deploying resources to cloud environments. Experience working in Linux and Windows environments and managing resources for efficiency. ACTIVE Top Secret with SCI and Polygraph Required Technologies: Python Ray and FastAPI Autogen and other More ❯
beyond traditional code writing and debugging. Understanding of how to craft prompts for discrete tasks such as data labeling and processing. Experience with asynchronous Python development using frameworks like Ray and FastAPI. Experience developing and deploying AI/ML models as part of production systems, implementing software development best practices for scalability and reliability. Experience developing and applying advanced machine … learning methods including clustering, regression, optimization, recommender engines, and artificial neural networks. Deploying models into multi-node Ray clusters utilizing various GPU resources. Experience with developing and deploying resources to cloud environments. Experience working in Linux and Windows environments and managing resources for efficiency. Demonstrated ETL expertise. Preferred Qualifications Bachelor's degree in Data Science, Statistics, Computer Science, or a More ❯
About the Team The Sensors Division at STR focuses on technology development for advanced sensor systems and platforms in support of national security. The Systems Autonomy, Analysis, and Modeling (SAAM) Group within the Sensors Division develops, adapts, and applies cutting More ❯