Senior Robotics Data Engineer
Company Overview
CNTXT AI is a UAE-based Data and AI company that helps organizations become AI-ready by delivering sovereign, end-to-end solutions rooted in the region and built to scale globally.
We provide secure data services, custom AI solutions, and industry-specific applications, enabling seamless adoption and deployment across enterprise and government environments.
Our expertise spans the full AI lifecycle, from data advisory, labeling, and annotation to advanced machine learning. With Arabic-first, domain-specific innovation, CNTXT AI delivers solutions that solve real-world problems while ensuring full control, security, and sovereignty over data.
Role Overview
We are seeking a Senior Robotics Data Engineer with strong experience in robotics data systems and, preferably, humanoid robot platforms. This role will focus on building scalable pipelines for collecting, processing, labeling, and managing multimodal robotics data (vision, LiDAR, IMU, force sensors, telemetry) used in perception, navigation, manipulation, and autonomy.
You will work at the intersection of robotics engineering, machine learning, and data infrastructure, enabling CNTXT AI’s robotics lab to train and deploy robust AI models in real-world environments.
Key Responsibilities
Robotics Data Pipeline Development
- Design and implement end-to-end data pipelines for robotics systems, including humanoid platforms
- Manage ingestion, synchronization, and storage of multimodal sensor data:
  - RGB / Depth cameras
  - LiDAR / Radar
  - IMU + joint encoders
  - Force/torque sensors
  - Telemetry + control logs
- Ensure reliable dataset generation from real-world robot deployments and simulation environments
Humanoid Robotics Data Systems (Preferred Focus)
- Support data workflows for humanoid robot perception and manipulation tasks such as:
  - Whole-body motion tracking
  - Object grasping and dexterous manipulation
  - Human-robot interaction data
  - Locomotion and balance datasets
- Build infrastructure to capture and curate datasets for embodied AI and foundation robotics models
Data Quality, Annotation & Labeling
- Develop scalable workflows for robotics dataset labeling, including:
  - 2D/3D bounding boxes
  - Semantic + instance segmentation
  - Keypoint tracking (human pose / robot joints)
  - Scene graph and task annotations
- Partner with annotation teams to enforce quality standards and feedback loops
Sensor Fusion & Dataset Alignment
- Implement robust sensor calibration, timestamp alignment, and data synchronization pipelines
- Support downstream ML workflows in:
  - SLAM
  - Sensor fusion
  - Perception stacks
  - Autonomous navigation
Collaboration with AI & Robotics Teams
- Work closely with robotics engineers, ML researchers, and autonomy teams to define data requirements
- Enable training-ready datasets for computer vision, reinforcement learning, imitation learning, and robotics foundation models
- Maintain documentation, best practices, and scalable data operations processes
Infrastructure & Tooling
- Build and maintain data infrastructure using modern tools such as:
  - ROS / ROS2 bag pipelines
  - Python, C++, SQL
  - Cloud storage + distributed compute
  - Data versioning tools (DVC, LakeFS, etc.)
  - MLOps platforms (MLflow; Kubeflow is a plus)
- Optimize robotics dataset storage and loading performance for training at scale
Qualifications
Required
- 5+ years of experience in robotics data engineering, perception data pipelines, or autonomous systems
- Strong proficiency in Python and experience with robotics software stacks (ROS/ROS2)
- Experience working with multimodal robotics sensor datasets (vision + LiDAR + IMU)
- Strong understanding of robotics data challenges: synchronization, calibration, noise, drift
- Experience building scalable pipelines for AI/ML training datasets
- Excellent problem-solving skills in real-world robotics environments
Preferred (Humanoid Robotics Background)
- Experience working directly with humanoid robots (Atlas, Digit, Tesla Optimus-style systems, etc.)
- Familiarity with locomotion, manipulation, and embodied AI datasets
- Exposure to reinforcement learning or imitation learning pipelines
- Experience with simulation tools (Isaac Sim, MuJoCo, Gazebo)
- Knowledge of foundation model approaches for robotics
Why Join CNTXT AI?
- Work at the cutting edge of humanoid robotics + AI data infrastructure
- Build real-world robotics datasets powering autonomy and embodied intelligence
- Collaborate with world-class AI researchers and robotics engineers
- Be part of the UAE’s sovereign AI and robotics innovation ecosystem