You will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a … to apply!
Job Responsibilities
ETL/ELT Pipeline Development: Design, develop, and optimize efficient and scalable ETL/ELT pipelines using Python, PySpark, and Apache Airflow. Implement batch and real-time data processing solutions using Apache Spark. Ensure data quality, governance, and security throughout the data lifecycle.
Cloud …/CD pipelines for data workflows to ensure smooth and reliable deployments.
Big Data & Analytics: Develop and optimize large-scale data processing pipelines using Apache Spark and PySpark. Implement data partitioning, caching, and performance tuning techniques to enhance Spark-based workloads (a rough sketch follows this listing). Work with diverse data formats (structured and unstructured) …
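For illustration only, the partitioning, caching, and performance-tuning work this role describes could look something like the minimal PySpark sketch below. Every path, column name, and tuning value here is invented for the example; nothing is taken from the employer's actual stack.

```python
from pyspark.sql import SparkSession, functions as F

# Illustrative sketch only: paths, column names, and tuning values are invented.
spark = (
    SparkSession.builder
    .appName("events-daily-aggregation")
    # Size shuffle parallelism to the cluster instead of the 200-partition default.
    .config("spark.sql.shuffle.partitions", "64")
    .getOrCreate()
)

events = spark.read.parquet("abfss://raw@examplelake.dfs.core.windows.net/events/")

# Cache a DataFrame that several downstream aggregations will reuse.
enriched = events.withColumn("event_date", F.to_date("event_ts")).cache()

daily_counts = enriched.groupBy("event_date", "event_type").count()

# Partition output on a low-cardinality column so later reads can prune files.
(
    daily_counts
    .repartition("event_date")
    .write.mode("overwrite")
    .partitionBy("event_date")
    .parquet("abfss://curated@examplelake.dfs.core.windows.net/daily_counts/")
)

enriched.unpersist()
spark.stop()
```

Caching only pays off when the same DataFrame feeds several downstream actions; otherwise it just burns executor memory, which is why the sketch unpersists as soon as the reuse is over.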
London, England, United Kingdom Hybrid / WFH Options
Biprocsi Ltd
automation to ensure successful project delivery, adhering to client timelines and quality standards. Implement and manage real-time and batch data processing frameworks (e.g., Apache Kafka, Apache Spark, Google Cloud Dataproc) in line with project needs. Build and maintain robust monitoring, logging, and alerting systems for client … in languages like Python, Bash, or Go to automate tasks and build necessary tools. Expertise in designing and optimising data pipelines using frameworks like Apache Airflow or equivalent. Demonstrated experience with real-time and batch data processing frameworks, including Apache Kafka, Apache Spark, or Google Cloud Dataproc …
You will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a …
Engineering & Data Pipeline Development: Design, develop, and optimize scalable data workflows using Python, PySpark, and Airflow (a minimal orchestration sketch follows this listing). Implement real-time and batch data processing using Spark. Enforce best practices for data quality, governance, and security throughout the data lifecycle. Ensure data availability, reliability, and performance through monitoring and automation.
Cloud …/CD pipelines for data workflows to ensure smooth and reliable deployments.
Big Data & Analytics: Build and optimize large-scale data processing pipelines using Apache Spark and PySpark. Implement data partitioning, caching, and performance tuning for Spark-based workloads. Work with diverse data formats (structured and unstructured …
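As a rough sketch of the Python/PySpark/Airflow combination this listing keeps returning to, the DAG below submits a daily PySpark job. The dag_id, script path, and connection id are hypothetical, and the `schedule` argument assumes Airflow 2.4 or later (older releases use `schedule_interval`).

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

# Hypothetical DAG: the dag_id, script path, and connection id are placeholders.
with DAG(
    dag_id="daily_events_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    transform_events = SparkSubmitOperator(
        task_id="transform_events",
        application="/opt/jobs/transform_events.py",  # a PySpark script like the sketch above
        conn_id="spark_default",
        # Pass the logical date so each run processes exactly one day of data.
        application_args=["--run-date", "{{ ds }}"],
    )
```

Passing `{{ ds }}` ties each task run to its logical date, which keeps reruns and backfills deterministic.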
learning libraries in one or more programming languages. Keen interest in some of the following areas: Big Data Analytics (e.g. Google BigQuery/BigTable, Apache Spark), Parallel Computing (e.g. Apache Spark, Kubernetes, Databricks), Cloud Engineering (AWS, GCP, Azure), Spatial Query Optimisation, Data Storytelling with (Jupyter) Notebooks …
London, England, United Kingdom Hybrid / WFH Options
Endava
Key Responsibilities
Data Pipeline Development: Architect, implement, and maintain real-time and batch data pipelines to handle large datasets efficiently (a streaming sketch follows this listing). Employ frameworks such as Apache Spark, Databricks, Snowflake, or Airflow to automate ingestion, transformation, and delivery.
Data Integration & Transformation: Work with Data Analysts to understand source-to-target … ensure regulatory compliance (GDPR). Document data lineage and recommend improvements for data ownership and stewardship.
Qualifications
Programming: Python, SQL, Scala, Java.
Big Data: Apache Spark, Hadoop, Databricks, Snowflake, etc.
Cloud: AWS (Glue, Redshift), Azure (Synapse, Data Factory, Fabric), GCP (BigQuery, Dataflow).
Data Modelling & Storage: Relational (PostgreSQL …
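For the real-time half of the pipelines described above, a Spark Structured Streaming job reading from Kafka might be sketched as follows; the broker address, topic, schema, and output paths are placeholders, not anything this employer prescribes.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

# Minimal sketch: broker address, topic, schema, and paths are all hypothetical,
# and the job assumes the spark-sql-kafka connector package is on the classpath.
spark = SparkSession.builder.appName("orders-stream").getOrCreate()

schema = StructType([
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("updated_at", TimestampType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
)

# Kafka delivers raw bytes; decode the value column and apply the expected schema.
orders = (
    raw.select(F.from_json(F.col("value").cast("string"), schema).alias("o"))
    .select("o.*")
)

query = (
    orders.writeStream.format("parquet")
    .option("path", "/data/curated/orders/")
    .option("checkpointLocation", "/data/checkpoints/orders/")  # enables restart recovery
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()
```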
London, England, United Kingdom Hybrid / WFH Options
Datapao
companies where years-long behemoth projects are the norm, our projects are fast-paced, typically 2 to 4 months long. Most are delivered using Apache Spark/Databricks on AWS/Azure and require you to directly manage the customer relationship alone or in collaboration with a Project … at DATAPAO, meaning that you'll get access to Databricks' public and internal courses to learn all the tricks of Distributed Data Processing, MLOps, Apache Spark, Databricks, and Cloud Migration from the best. Additionally, we'll pay for various data & cloud certifications, and you'll get dedicated time for … seniority level during the selection process.
About DATAPAO
At DATAPAO, we are delivery partners and the preferred training provider for Databricks, the creators of Apache Spark. Additionally, we are Microsoft Gold Partners in delivering cloud migration and data architecture on Azure. Our delivery partnerships enable us to work in …
London, England, United Kingdom Hybrid / WFH Options
Luupli
analytical, problem-solving, and critical thinking skills.
8. Experience with social media analytics and understanding of user behaviour.
9. Familiarity with big data technologies, such as Apache Hadoop, Apache Spark, or Apache Kafka.
10. Knowledge of AWS machine learning services, such as Amazon SageMaker and Amazon Comprehend.
11. Experience with …
London, England, United Kingdom Hybrid / WFH Options
bigspark
Engineer - UK Remote
About Us
bigspark, a UK-based consultancy delivering next-level data platforms and solutions with a focus on exciting technologies including Apache Spark and Apache Kafka, and working on projects within Machine Learning, Data Engineering, Streaming, and Data Science, is looking for a Python Software …
London, England, United Kingdom Hybrid / WFH Options
Endava Limited
with business objectives.
Key Responsibilities: Architect, implement, and maintain real-time and batch data pipelines to handle large datasets efficiently. Employ frameworks such as Apache Spark, Databricks, Snowflake, or Airflow to automate ingestion, transformation, and delivery.
Data Integration & Transformation: Work with Data Analysts to understand source-to-target … ensure regulatory compliance (GDPR). Document data lineage and recommend improvements for data ownership and stewardship.
Qualifications
Programming: Python, SQL, Scala, Java.
Big Data: Apache Spark, Hadoop, Databricks, Snowflake, etc.
Data Modelling: Designing dimensional, relational, and hierarchical data models.
Scalability & Performance: Building fault-tolerant, highly available data architectures. …
London, England, United Kingdom Hybrid / WFH Options
DEPOP
platform teams at scale, ideally in a consumer or marketplace environment. Deep understanding of distributed systems and modern data ecosystems, including experience with Databricks, Apache Spark, Apache Kafka, and dbt. Demonstrated success in managing data platforms at scale, including both batch processing and real-time streaming architectures. …
driving business value through ML. Company-first focus and collaborative individuals - we work better when we work together.
Preferred: Experience working with Databricks and Apache Spark.
Preferred: Experience working in a customer-facing role.
About Databricks
Databricks is the data and AI company. More than 10,000 organizations … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake, and MLflow.
Benefits
At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of …
London, England, United Kingdom Hybrid / WFH Options
Apollo Solutions
manipulation and analysis, with the ability to build, maintain, and deploy sequences of automated processes.
Bonus Experience (Nice to Have): Familiarity with dbt, Fivetran, Apache Airflow, Data Mesh, Data Vault 2.0, Fabric, and Apache Spark. Experience working with streaming technologies such as Apache Kafka, Apache …
Newcastle upon Tyne, England, United Kingdom Hybrid / WFH Options
Gaming Innovation Group
at:
Object-oriented programming (Java)
Data modelling using any database technologies
ETL processes (ETLs are old school, we transfer in memory now) and experience with Apache Spark or Apache NiFi (see the transfer sketch after this listing)
Applied understanding of CI/CD in change management
Dockerised applications
Used distributed version control systems
Excellent team player …
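The "we transfer in memory now" aside presumably refers to Spark-style pipelines that move records between stores without staging files on disk. A hedged sketch of that idea, with all connection details invented and a suitable JDBC driver assumed on the classpath:

```python
from pyspark.sql import SparkSession

# Sketch of an in-memory transfer: the JDBC URLs, credentials, and table names
# are invented, and a matching JDBC driver is assumed to be on the classpath.
spark = SparkSession.builder.appName("db-to-db-transfer").getOrCreate()

source = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://source-db:5432/app")
    .option("dbtable", "public.players")
    .option("user", "reader")
    .option("password", "REDACTED")
    .load()
)

# Transform while the data stays distributed in memory; no staging files on disk.
active = source.filter(source.status == "active")

(
    active.write.format("jdbc")
    .option("url", "jdbc:postgresql://warehouse-db:5432/dwh")
    .option("dbtable", "analytics.active_players")
    .option("user", "writer")
    .option("password", "REDACTED")
    .mode("append")
    .save()
)
```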
Manchester, England, United Kingdom Hybrid / WFH Options
Gaming Innovation Group
Object-oriented programming (Java)
Data modeling using various database technologies
ETL processes (transferring data in-memory, moving away from traditional ETLs) and experience with Apache Spark or Apache NiFi
Applied understanding of CI/CD in change management
Dockerized applications
Using distributed version control systems
Being an …
London, England, United Kingdom Hybrid / WFH Options
SBS
modelling, design, and integration expertise.
Data Mesh Architectures: In-depth understanding of data mesh architectures.
Technical Proficiency: Proficient in dbt, SQL, Python/Java, Apache Spark, Trino, Apache Airflow, and Astro.
Cloud Technologies: Awareness and experience with cloud technologies, particularly AWS.
Analytical Skills: Excellent problem-solving and …
City of Westminster, England, United Kingdom Hybrid / WFH Options
nudge Global Ltd
with cloud data platforms such as GCP (BigQuery, Dataflow) or Azure (Data Factory, Synapse)
Expert in SQL, MongoDB, and distributed data systems such as Spark, Databricks, or Kafka
Familiarity with data warehousing concepts and tools (e.g. Snowflake)
Experience with CI/CD pipelines, containerization (Docker), and infrastructure-as-code …
London, England, United Kingdom Hybrid / WFH Options
Locus Robotics
and scaling data systems. Highly desired: experience with Azure, particularly Lakehouse and Eventhouse architectures. Experience with relevant infrastructure and tools including NATS, Power BI, Apache Spark/Databricks, and PySpark. Hands-on experience with data warehousing methodologies and optimization libraries (e.g., OR-Tools). Experience with log analysis …
London, England, United Kingdom Hybrid / WFH Options
DATAPAO
most complex projects - individually or by leading small delivery teams. Our projects are fast-paced, typically 2 to 4 months long, and primarily use Apache Spark/Databricks on AWS/Azure. You will manage customer relationships either alone or with a Project Manager, and support our pre …
London, England, United Kingdom Hybrid / WFH Options
Aimpoint Digital
industries. Design and develop feature engineering pipelines, build ML & AI infrastructure, deploy models, and orchestrate advanced analytical insights. Write code in SQL, Python, and Spark following software engineering best practices. Collaborate with stakeholders and customers to ensure successful project delivery.
Who we are looking for
We are looking for …
London, England, United Kingdom Hybrid / WFH Options
Cloudera
Data Engineering product area. This next-generation cloud-native service empowers customers to run large-scale data engineering workflows, using industry-standard tools like Apache Spark and Apache Airflow, with just a few clicks, across both on-premises and public cloud environments. You'll play a critical … lead their own teams across multiple time zones.
Oversee a global team, many of whom are active contributors to open source communities like the Apache Software Foundation.
Own both technical direction and people management within the team.
Ensure consistent, high-quality software delivery through iterative releases.
Hire, manage, coach …
City of London, England, United Kingdom Hybrid / WFH Options
Staging It
modelling (relational, NoSQL) and ETL/ELT processes. Experience with data integration tools (e.g., Kafka, Talend) and APIs (a small consumer sketch follows this listing). Familiarity with big data technologies (Hadoop, Spark) and real-time streaming. Expertise in cloud security, data governance, and compliance (GDPR, HIPAA). Strong SQL skills and proficiency in at least one …
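As a small illustration of the Kafka-based integration mentioned above, here is a hedged consumer sketch using the kafka-python client; the topic, group id, and broker address are made up for the example.

```python
import json

from kafka import KafkaConsumer  # kafka-python client, assumed to be installed

# Hypothetical consumer: the topic, group id, and broker address are made up.
consumer = KafkaConsumer(
    "customer-updates",
    bootstrap_servers="broker:9092",
    group_id="integration-service",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",  # replay from the start on first run
)

for message in consumer:
    record = message.value
    # A real integration would validate the record and upsert it downstream.
    print(record.get("customer_id"), record.get("email"))
```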
delivery across a range of projects, including data analysis, extraction, transformation, and loading, data intelligence, and data security, and proven experience in their technologies (e.g. Spark, cloud-based ETL services, Python, Kafka, SQL, Airflow). You have experience in assessing relevant data quality issues based on data sources and use cases …
team-oriented environment.
Preferred Skills:
Experience with programming languages such as Python or R for data analysis.
Knowledge of big data technologies (e.g., Hadoop, Spark) and data warehousing concepts.
Familiarity with cloud data platforms (e.g., Azure, AWS, Google Cloud) is a plus.
Certification in BI tools, SQL, or related …