Principal Data Architect - AI, Robotics & Scientific Platforms
Location: London or Glasgow
This is your chance to join a high-growth deep-tech company building a next-generation platform at the intersection of AI, robotics, and scientific discovery.
You’ll play a key role in designing and owning the core data architecture that powers a globally distributed network of automated laboratories, shaping how data is captured, structured, and used to drive AI-led innovation.
Why Join
- Build the data backbone of a global AI-driven scientific platform.
- Work on complex, high-impact challenges across distributed systems, real-time data, and AI infrastructure.
- Be part of a fast-scaling, collaborative team working at the frontier of robotics, data, and science.
- Gain real ownership in defining data architecture, governance, and platform strategy from the ground up.
What You Will Be Doing
- Design and implement a scalable, ML-ready data architecture across scientific and operational systems.
- Build a data lakehouse on AWS to handle large-scale unstructured data (sensor data, logs, experimental outputs).
- Architect real-time streaming pipelines (e.g. MQTT/event-driven systems) for high-frequency telemetry.
- Design distributed data systems that synchronise edge (lab) and cloud environments globally.
- Develop graph and semantic data models for complex, highly relational datasets.
- Integrate vector databases to enable large-scale similarity search and AI use cases.
- Define data governance frameworks, including tenancy, security, and compliance.
- Enable secure, scalable data sharing across enterprise customers and external partners.
- Help shape internal data standards, architecture patterns, and long-term platform strategy.
About You
- 8+ years' experience in Data Architecture / Data Platform Engineering.
- Strong Python experience in production data systems.
- Deep experience with PostgreSQL (ideally AWS RDS).
- Proven experience building distributed data systems at scale.
- Strong understanding of streaming / event-driven architectures (e.g. MQTT, Kafka).
- Experience working with high-volume telemetry / IoT / time-series data.
- Comfortable operating as a hands-on, senior individual contributor.
Nice to Have
- Experience with graph or vector databases (Neo4j, Pinecone, pgvector).
- Background in data governance, multi-tenancy, or secure data sharing.
- Exposure to scientific, robotics, or manufacturing data environments.
- Familiarity with modern data stacks (lakehouse, streaming, real-time pipelines).
- Experience in regulated environments (SOC 2, ISO 27001).
Interested?
Apply now or reach out directly at shubhangi@akkar.com. We’re moving quickly!