… and contribute to technical roadmap planning
Technical Skills:
- Strong SQL skills with experience in complex query optimisation
- Strong Python programming skills with experience in data processing libraries (pandas, NumPy, Apache Spark)
- Hands-on experience building and maintaining data ingestion pipelines
- Proven track record of optimising queries, code, and system performance
- Experience with open-source data processing frameworks (Apache Spark, Apache Kafka, Apache Airflow)
- Knowledge of distributed computing concepts and big data technologies
- Experience with version control systems (Git) and CI/CD practices
- Experience with relational databases (PostgreSQL, MySQL, or similar)
- Experience with containerisation technologies (Docker, Kubernetes)
- Experience with data orchestration tools (Apache Airflow or Dagster)
- Understanding of data warehousing concepts and dimensional …
London (City of London), South East England, United Kingdom
Vallum Associates
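To gauge the level implied by the pandas/NumPy requirement in the listing above, here is a minimal, hypothetical data-processing sketch; the file name and column names are invented for illustration only:

```python
import pandas as pd
import numpy as np

# Load a raw CSV extract (hypothetical file and columns).
orders = pd.read_csv("orders.csv", parse_dates=["order_date"])

# Typical cleansing step: drop duplicate rows, fill missing amounts.
orders = orders.drop_duplicates(subset="order_id")
orders["amount"] = orders["amount"].fillna(0.0)

# Vectorised transformation with NumPy instead of a Python loop.
orders["amount_usd"] = np.round(orders["amount"] * 1.27, 2)

# Aggregate for downstream reporting.
daily = orders.groupby(orders["order_date"].dt.date)["amount_usd"].sum()
print(daily.head())
```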
… staying current with emerging data technologies.
Technical Requirements
- Proficiency in SQL, including complex query design and optimisation.
- Strong Python programming skills, particularly with libraries such as pandas, NumPy, and Apache Spark.
- Experience building and maintaining data ingestion pipelines and optimising performance.
- Hands-on experience with open-source data frameworks such as Apache Spark, Apache Kafka, or Apache Airflow.
- Knowledge of distributed computing and big data concepts.
- Experience using version control systems (Git) and CI/CD practices.
- Familiarity with relational databases (PostgreSQL, MySQL, or similar).
- Experience with containerisation technologies (Docker, Kubernetes).
- Understanding of data orchestration tools (e.g., Airflow or Dagster).
- Knowledge of data warehousing principles and dimensional modelling …
London (City of London), South East England, United Kingdom
Norton Blake
… of our client's data platform. This role is ideal for someone who thrives on building scalable data solutions and is confident working with modern tools such as Azure Databricks, Apache Kafka, and Spark. In this role, you'll play a key part in designing, delivering, and optimising data pipelines and architectures. Your focus will be on enabling … and want to make a meaningful impact in a collaborative, fast-paced environment, we want to hear from you!
Role and Responsibilities
- Designing and building scalable data pipelines using Apache Spark in Azure Databricks
- Developing real-time and batch data ingestion workflows, ideally using Apache Kafka
- Collaborating with data scientists, analysts, and business stakeholders to build high …
… and Experience
We're seeking candidates who bring strong technical skills and a hands-on approach to modern data engineering. You should have:
- Proven experience with Azure Databricks and Apache Spark
- Working knowledge of Apache Kafka and real-time data streaming
- Strong proficiency in SQL and Python
- Familiarity with Azure Data Services and CI/CD pipelines …
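As a rough illustration of the Databricks/Kafka responsibilities listed above — not the client's actual pipeline — a minimal PySpark Structured Streaming sketch; the broker address, topic name, message schema, and storage paths are all assumptions:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

# Expected shape of each Kafka message (hypothetical schema).
schema = StructType() \
    .add("event_id", StringType()) \
    .add("value", DoubleType())

# Read a stream from Kafka (broker and topic are placeholders).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")
       .option("subscribe", "events")
       .load())

# Kafka delivers bytes; parse the JSON payload into typed columns.
events = (raw.select(from_json(col("value").cast("string"), schema).alias("e"))
             .select("e.*"))

# Write to a Delta table with checkpointing so the stream can recover.
query = (events.writeStream
         .format("delta")
         .option("checkpointLocation", "/tmp/checkpoints/events")
         .start("/tmp/delta/events"))
query.awaitTermination()
```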
Technical Expertise:
- Solid experience in Python programming, particularly using data manipulation and processing libraries such as pandas, NumPy, and Apache Spark.
- Hands-on experience with open-source data frameworks like Apache Spark, Apache Kafka, and Apache Airflow.
- Strong proficiency in SQL, including advanced query development and performance tuning.
- Good understanding of distributed computing principles and … automation pipelines.
- Experience working with relational databases such as PostgreSQL, MySQL, or equivalent platforms.
- Skilled in using containerization technologies including Docker and Kubernetes.
- Experience with workflow orchestration tools like Apache Airflow or Dagster.
- Familiar with streaming data pipelines and real-time analytics solutions.
London (City of London), South East England, United Kingdom
Vallum Associates
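The workflow-orchestration requirement in the listing above can be pictured with a minimal Apache Airflow DAG; task logic and identifiers are placeholders, and the `schedule` argument assumes Airflow 2.4 or later:

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder for pulling data from a source system.
    print("extracting")

def transform():
    # Placeholder for cleansing/reshaping the extracted data.
    print("transforming")

# A daily DAG with two dependent tasks; ids and schedule are illustrative.
with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t1 >> t2  # transform runs only after extract succeeds
```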
… us and help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
Your Role
We are looking for a skilled Spark/Scala Developer to join our data engineering team. The ideal candidate will have hands-on experience in designing, developing, and maintaining large-scale data processing pipelines using Apache Spark and Scala. You will work closely with data scientists, analysts, and engineers to build efficient data solutions and enable data-driven decision-making.
Key Responsibilities:
- Develop, optimize, and maintain data pipelines and ETL processes using Apache Spark and Scala.
- Design scalable and robust data processing solutions for batch and real-time data.
- Collaborate with cross-functional teams to gather requirements and translate them into technical specifications.
- Perform data ingestion, transformation, and cleansing from various structured and unstructured sources.
- Monitor and troubleshoot Spark jobs, ensuring high performance and reliability.
- Write clean, maintainable, and well-documented code.
- Participate in code reviews, design discussions, and agile ceremonies.
- Implement data quality and governance best practices.
- Stay updated with …
London (City of London), South East England, United Kingdom
Capgemini
… the core infrastructure for our real-time data streaming platform, ensuring high availability, reliability, and low latency. Implement and optimize data pipelines and stream processing applications using technologies like Apache Kafka, Apache Flink, and Spark Streaming. Collaborate with software and data engineering teams to define event schemas, ensure data quality, and support the integration of new services … with every single technology listed. We encourage you to apply if you have a strong foundation in a majority of these areas.
- Streaming Platforms & Architecture: Strong production experience with Apache Kafka and its ecosystem (e.g., Confluent Cloud, Kafka Streams, Kafka Connect). Solid understanding of distributed systems and event-driven architectures and how they drive modern microservices and data … pipelines.
- Real-Time Data Pipelines: Experience building and optimizing real-time data pipelines for ML, analytics and reporting, leveraging technologies such as Apache Flink, Spark Structured Streaming, and integration with low-latency OLAP systems like Apache Pinot.
- Platform Infrastructure & Observability: Hands-on experience with major cloud platforms (AWS, GCP, or Azure), Kubernetes and Docker, coupled with proficiency …
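For a sense of the Kafka work described above, a minimal consumer sketch using the confluent-kafka Python client; the broker address, group id, and topic name are placeholders, not details from the posting:

```python
from confluent_kafka import Consumer

# Connection settings are placeholders for a real cluster.
conf = {
    "bootstrap.servers": "broker:9092",
    "group.id": "example-consumer",
    "auto.offset.reset": "earliest",
}

consumer = Consumer(conf)
consumer.subscribe(["events"])  # hypothetical topic name

try:
    while True:
        msg = consumer.poll(timeout=1.0)  # block up to 1s for a message
        if msg is None:
            continue
        if msg.error():
            print(f"consumer error: {msg.error()}")
            continue
        # Each message carries raw bytes; decode before processing.
        print(msg.topic(), msg.partition(), msg.value().decode("utf-8"))
finally:
    consumer.close()  # commit offsets and leave the group cleanly
```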
… of data modelling and data warehousing concepts
- Familiarity with version control systems, particularly Git
Desirable Skills:
- Experience with infrastructure as code tools such as Terraform or CloudFormation
- Exposure to Apache Spark for distributed data processing
- Familiarity with workflow orchestration tools such as Airflow or AWS Step Functions
- Understanding of containerisation using Docker
- Experience with CI/CD pipelines …
London, South East England, United Kingdom (Hybrid / WFH Options)
Experis
… Excellent problem-solving skills and ability to work independently in a fast-paced environment.
Desirable:
- Experience with NLP, computer vision, or time-series forecasting.
- Familiarity with distributed computing frameworks (Spark, Ray).
- Experience with MLOps and model governance practices.
- Previous contract experience in a similar ML engineering role.
Contract Details: Duration: 6–12 months (extension possible). Location: London (Hybrid …
Slough, South East England, United Kingdom (Hybrid / WFH Options)
Experis
London, South East England, United Kingdom (Hybrid / WFH Options)
Advanced Resource Managers Limited
… experience with Trino/Starburst Enterprise/Galaxy administration/CLI.
- Implementation experience with container orchestration solutions (Kubernetes/OpenShift).
- Knowledge of Big Data (Hadoop/Hive/Spark) and cloud technologies (AWS, Azure, GCP).
- Understanding of distributed system architecture, high availability, scalability, and fault tolerance.
- Familiarity with security authentication systems such as LDAP, Active Directory, OAuth2 …
… Science, Computer Science, or a related field.
- 5+ years of experience in data engineering and data quality.
- Strong proficiency in Python/Java, SQL, and data processing frameworks including Apache Spark.
- Knowledge of machine learning and its data requirements.
- Attention to detail and a strong commitment to data integrity.
- Excellent problem-solving skills and ability to work in a …
London (City of London), South East England, United Kingdom
Humanoid
… big plus):
- Knowledge of deep learning frameworks (PyTorch, TensorFlow), transformers, or LLMs
- Familiarity with MLOps tools (MLflow, SageMaker, Airflow, etc.)
- Experience with streaming data (Kafka, Kinesis) and distributed computing (Spark, Dask)
- Skills in data visualization apps (Streamlit, Dash) and dashboarding (Tableau, Looker)
- Domain experience in forecasting, optimisation, or geospatial analytics
We would like to talk to you if you …
… scalable pipelines, data platforms, and integrations, while ensuring solutions meet regulatory standards and align with architectural best practices.
Key Responsibilities:
- Build and optimise scalable data pipelines using Databricks and Apache Spark (PySpark).
- Ensure performance, scalability, and compliance (GxP and other standards).
- Collaborate on requirements, design, and backlog refinement.
- Promote engineering best practices including CI/CD …
… experience:
- Experience with efficient, reliable data pipelines that improve time-to-insight.
- Knowledge of secure, auditable, and compliant data workflows.
- Know-how in optimising performance and reducing costs through Spark and Databricks tuning.
- Ability to create reusable, well-documented tools enabling collaboration across teams.
- A culture of engineering excellence driven by mentoring and high-quality practices.
Preferred Experience:
- Databricks in a SaaS environment, Spark, Python, and database technologies.
- Event-driven and distributed systems (Kafka, AWS SNS/SQS, Java, Python).
- Data Governance, Data Lakehouse/Data Intelligence platforms.
- AI software delivery and AI data preparation.
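The Spark and Databricks tuning mentioned above often comes down to join strategy and file layout. A brief, hypothetical PySpark sketch — table paths and column names are invented, and this is one common pattern, not the employer's method:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

# Hypothetical tables: a large fact table and a small dimension table.
facts = spark.read.parquet("/data/facts")
dims = spark.read.parquet("/data/dims")

# Broadcasting the small side avoids a full shuffle join.
joined = facts.join(broadcast(dims), "dim_id")

# Repartition by a commonly filtered column before writing,
# so downstream reads can prune partitions effectively.
(joined.repartition("event_date")
       .write.mode("overwrite")
       .partitionBy("event_date")
       .parquet("/data/joined"))
```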