/MS in Computer Science, Software Engineering, or equivalent technical discipline. 8+ years of hands-on experience building large-scale distributed data pipelines and architectures. Expert-level knowledge of Apache Spark, PySpark, and Databricks, including experience with Delta Lake, Unity Catalog, MLflow, and Databricks Workflows. Deep proficiency in Python and SQL, with proven experience building modular, testable, reusable pipeline components. …
City of London, Greater London, UK Hybrid / WFH Options
Bondaval
Science (or similar) from a good University is highly desirable. Nice to Have: Familiarity with message brokers (Kafka, SQS/SNS, RabbitMQ). Knowledge of real-time streaming (Kafka Streams, Apache Flink, etc.). Exposure to big-data or machine-learning frameworks (TensorFlow, PyTorch, Hugging Face, LangChain). Understanding of infrastructure and DevOps (Terraform, Ansible, AWS, Kubernetes). Exposure to …
City of London, Greater London, UK Hybrid / WFH Options
Areti Group | B Corp
and government organisations, delivering real-world innovation powered by data and technology. Tech Stack & Skills We're Looking For: Palantir, Azure Databricks, Microsoft Azure, Python, Docker & Kubernetes, Linux, Apache tools, data pipelines, IoT (Internet of Things), Scrum/Agile methodologies. Ideal Candidate: Already DV Cleared, or at least SC. Strong communication skills; comfortable working directly with clients and …
City of London, Greater London, UK Hybrid / WFH Options
Futuria
data integrity, consistency, and accuracy across systems. Optimize data infrastructure for performance, cost efficiency, and scalability in cloud environments. Develop and manage graph-based data systems (e.g., Kuzu, Neo4j, Apache AGE) to model and query complex relationships in support of Retrieval Augmented Generation (RAG) and agentic architectures. Contribute to text retrieval pipelines involving vector embeddings and knowledge graphs, for … workflows. Proficiency with cloud platforms such as Azure, AWS, or GCP and their managed data services. Desirable: Experience with asynchronous Python programming. Experience with graph technologies (e.g., Kuzu, Neo4j, Apache AGE). Familiarity with embedding models (hosted or local): OpenAI, Cohere, etc., or HuggingFace models/sentence-transformers. Solid understanding of data modeling, warehousing, and performance optimization. Experience with … messaging middleware and streaming (e.g., NATS JetStream, Redis Streams, Apache Kafka, or Pulsar). Hands-on experience with data lakes, lakehouses, or components of the modern data stack. Exposure to MLOps tools and best practices. Exposure to workflow orchestration frameworks (e.g., Metaflow, Airflow, Dagster). Exposure to Kubernetes. Experience working with unstructured data (e.g., logs, documents, images). Awareness of …