Washington, Washington DC, United States Hybrid / WFH Options
RAND Corporation
the AI/ML hardware stack (e.g., GPUs, TPUs, data center design). Familiarity with the AI/ML software stack (e.g., CUDA, PyTorch, TensorFlow, Ray). Experience working on AI research, ML model training, or model deployment. Experience with securing AI systems. Education Requirements RAND is hiring a Research Lead at
Washington, Washington DC, United States Hybrid / WFH Options
RAND Corporation
the AI/ML hardware stack (e.g., GPUs, TPUs, data center design). Familiarity with the AI/ML software stack (e.g., CUDA, PyTorch, TensorFlow, Ray). Experience working on AI research, ML model training, or model deployment. Experience with securing AI systems. Education Requirements RAND is hiring multiple Visiting AI Security
or business challenges into well-defined machine learning solutions. We use many technologies day to day, such as various AWS services, GCP, Kubernetes, Ray Serve, Kubeflow, and ReTool. Any experience in these areas would be a bonus. Sprout.ai Values: Hungry for Growth - Unleash your inner Sprout: Sprouts embrace growth
London, South East England, United Kingdom Hybrid / WFH Options
Oliver Bernard
to architectural decisions. What We’re Looking For: Strong Python programming skills (5+ years preferred). Deep experience with distributed systems (e.g., Kafka, Spark, Ray, Kubernetes). Hands-on work with big data technologies and architectures. Solid understanding of concurrency, fault tolerance, and data consistency. Comfortable in a fast-paced
the pipeline. Collaborate with research to define data quality benchmarks. Optimize end-to-end performance across distributed data processing frameworks (e.g., Apache Spark, Ray, Airflow). Work with infrastructure teams to scale pipelines across thousands of GPUs. Work directly with leadership on the data team roadmaps. Manage
to some of the biggest names in the insurance industry. We are developing a modern real-time ML platform using technologies like Python, PyTorch, Ray, k8s (helm + flux), Terraform, Postgres and Flink on AWS. We are very big fans of Infrastructure-as-Code and enjoy Agile practices. As a
experience in benchmarking foundational models for real-world applications. Experience in machine learning and developing AI models in frameworks such as PyTorch, TensorFlow, FSDP, Ray, and so forth. Expertise in one or more AI areas, including transfer learning, model distillation, surrogate models, and reinforcement learning. Research experience in designing and
learning research and production. High-Performance Data Pipelines: Develop and optimize distributed systems for data processing, including filtering, indexing, and retrieval, leveraging frameworks like Ray, Metaflow, Spark, or Hadoop. Synthetic Data Generation: Build and orchestrate pipelines to generate synthetic data at scale, advancing research on cost-efficient inference and training … rendering engines, and/or other software. Distributed Computing & MLOps: Demonstrated proficiency in setting up large-scale, robust data pipelines, using frameworks like Spark, Ray, or Metaflow. Comfortable with model versioning and experiment tracking. Performance Optimization: Good understanding of parallel and distributed computing. Experienced with setting up evaluation methods. Cloud