team. In this role, you will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a passion for data engineering and a … desire to make a significant impact, we encourage you to apply! Job Responsibilities ETL/ELT Pipeline Development: Design, develop, and optimize efficient and scalable ETL/ELT pipelines using Python, PySpark, and Apache Airflow. Implement batch and real-time data processing solutions using Apache Spark. Ensure data quality, governance, and security throughout the data lifecycle. Cloud Data Engineering: Manage and … and documentation. Required profile: Requirements: Client-facing role, so strong communication and collaboration skills are vital. Proven experience in data engineering, with hands-on expertise in Azure Data Services, PySpark, Apache Spark, and Apache Airflow. Strong programming skills in Python and SQL, with the ability to write efficient and maintainable code. Deep understanding of Spark internals, including RDDs, DataFrames …
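To give a concrete flavour of the orchestration work this role describes, here is a minimal sketch of an Airflow DAG submitting a PySpark job. It is illustrative only: the DAG id, script path, and connection name are hypothetical, and it assumes the apache-spark provider package is installed.

```python
# Minimal Airflow DAG that submits a nightly PySpark ETL job.
# All identifiers (DAG id, script path, connection) are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="nightly_sales_etl",      # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    transform = SparkSubmitOperator(
        task_id="transform_sales",
        application="/opt/jobs/transform_sales.py",  # hypothetical PySpark script
        conn_id="spark_default",
        conf={"spark.sql.shuffle.partitions": "200"},
    )
```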
data engineering and reporting, including storage, data pipelines to ingest and transform data, and querying & reporting of analytical data. You've worked with technologies such as Python, Spark, SQL, PySpark, Power BI, etc. You're a problem-solver, pragmatically exploring options and finding effective solutions. An understanding of how to design and build well-structured, maintainable systems. Strong communication skills …
UK. In this role, you will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a passion for data engineering and a … desire to make a significant impact, we encourage you to apply! Job Responsibilities Data Engineering & Data Pipeline Development: Design, develop, and optimize scalable data workflows using Python, PySpark, and Airflow. Implement real-time and batch data processing using Spark. Enforce best practices for data quality, governance, and security throughout the data lifecycle. Ensure data availability, reliability, and performance through … Implement CI/CD pipelines for data workflows to ensure smooth and reliable deployments. Big Data & Analytics: Build and optimize large-scale data processing pipelines using Apache Spark and PySpark. Implement data partitioning, caching, and performance tuning for Spark-based workloads. Work with diverse data formats (structured and unstructured) to support advanced analytics and machine learning initiatives. Workflow Orchestration …
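The partitioning, caching, and performance tuning mentioned above typically looks something like the following PySpark sketch. Paths, column names, and partition counts are hypothetical and would be sized to the actual workload.

```python
# Illustrative Spark tuning: repartitioning on a join key, caching a reused
# frame, and broadcasting a small dimension table to avoid a shuffle join.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("spark-tuning-sketch").getOrCreate()

events = spark.read.parquet("abfss://data@lake.dfs.core.windows.net/events")  # hypothetical path
dims = spark.read.parquet("abfss://data@lake.dfs.core.windows.net/dims")

# Spread work evenly across the cluster and cache a frame that several
# downstream aggregations will reuse.
events = events.repartition(200, "customer_id").cache()

# Broadcasting the small table keeps the join map-side.
enriched = events.join(F.broadcast(dims), "customer_id")

daily = enriched.groupBy("event_date").agg(F.count("*").alias("events"))
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "abfss://data@lake.dfs.core.windows.net/curated/daily_events"
)
```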
City of London, London, United Kingdom Hybrid / WFH Options
Osmii
Core Platform Build & Development Hands-on Implementation: Act as a lead engineer in the initial build-out of core data pipelines, ETL/ELT processes, and data models using PySpark, SQL, and Databricks notebooks. Data Ingestion & Integration: Establish scalable data ingestion frameworks from diverse sources (batch and streaming) into the Lakehouse. Performance Optimization: Design and implement solutions for optimal … Extensive experience with Azure data services (e.g., Azure Data Factory, Azure Data Lake Storage, Azure Synapse) and architecting cloud-native data platforms. Programming Proficiency: Expert-level skills in Python (PySpark) and SQL for data engineering and transformation. Scala is a strong plus. Data Modelling: Strong understanding and practical experience with data warehousing, data lake, and dimensional modelling concepts. ETL …
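As a rough illustration of the Lakehouse build-out this posting describes, a bronze-to-silver refinement step in PySpark with Delta Lake might look as follows. Table names, paths, and columns are hypothetical.

```python
# Illustrative medallion step: refine raw bronze records into a cleaned,
# typed, deduplicated silver table. Runs as-is on Databricks, where a
# SparkSession is provided; names and paths are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: raw ingested records, stored as-is.
bronze = spark.read.format("delta").load("/mnt/lake/bronze/orders")

# Silver: conformed schema - typed timestamps and amounts, one row per key.
silver = (
    bronze
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .dropDuplicates(["order_id"])
)

silver.write.format("delta").mode("overwrite").save("/mnt/lake/silver/orders")
```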
South East London, England, United Kingdom Hybrid / WFH Options
Osmii
Core Platform Build & Development Hands-on Implementation: Act as a lead engineer in the initial build-out of core data pipelines, ETL/ELT processes, and data models using PySpark, SQL, and Databricks notebooks. Data Ingestion & Integration: Establish scalable data ingestion frameworks from diverse sources (batch and streaming) into the Lakehouse. Performance Optimization: Design and implement solutions for optimal … Extensive experience with Azure data services (e.g., Azure Data Factory, Azure Data Lake Storage, Azure Synapse) and architecting cloud-native data platforms. Programming Proficiency: Expert-level skills in Python (PySpark) and SQL for data engineering and transformation. Scala is a strong plus. Data Modelling: Strong understanding and practical experience with data warehousing, data lake, and dimensional modelling concepts. ETL …
Wakefield, Yorkshire, United Kingdom Hybrid / WFH Options
Flippa.com
Continuous integration/deployment (CI/CD), automation, rigorous code reviews, documentation as communication. Preferred Qualifications: Familiarity with data manipulation and experience with Python libraries like Flask, FastAPI, Pandas, PySpark, and PyTorch, to name a few. Proficiency in statistics and/or machine learning libraries like NumPy, matplotlib, seaborn, scikit-learn, etc. Experience in building ETL/ELT processes and …
Coalville, Leicestershire, East Midlands, United Kingdom Hybrid / WFH Options
Ibstock PLC
consistency across the data platform. Knowledge, Skills and Experience: Essential: Strong expertise in Databricks and Apache Spark for data engineering and analytics. Proficient in SQL and Python/PySpark for data transformation and analysis. Experience in data lakehouse development and Delta Lake optimisation. Experience with ETL/ELT processes for integrating diverse data sources. Experience in gathering, documenting …
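For readers unfamiliar with the Delta Lake optimisation mentioned here, routine table maintenance commonly reduces small-file overhead and clusters data for faster scans. A minimal sketch with a hypothetical table name; OPTIMIZE and ZORDER are Databricks features, also available in recent open-source Delta releases:

```python
# Illustrative Delta Lake maintenance: compact small files, co-locate rows
# on a common filter column, and clean up unreferenced files.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and cluster rows by a frequently filtered column.
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")

# Remove files no longer referenced by the table (default 7-day retention).
spark.sql("VACUUM silver.orders")
```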
Are you passionate about revolutionising engineering with AI? Here at Monolith AI we're on a mission to empower engineers to use AI to solve even their most intractable physics problems. We've doubled in size over the last four …
cooperation with our data science team. Experiment in your domain to improve precision, recall, or cost savings. Requirements: Expert skills in Java or Python. Experience with Apache Spark or PySpark. Experience writing software for the cloud (AWS or GCP). Speaking and writing in English enables you to take part in day-to-day conversations in the team and contribute …
Head of Data Platform and Services, you'll not only maintain and optimize our data infrastructure but also spearhead its evolution. Built predominantly on Databricks, and utilizing technologies like PySpark and Delta Lake, our infrastructure is designed for scalability, robustness, and efficiency. You'll take charge of developing sophisticated data integrations with various advertising platforms, empowering our teams with … and informed decision-making What you'll be doing for us Leadership in Design and Development: Lead in the architecture, development, and upkeep of our Databricks-based infrastructure, harnessing PySpark and Delta Lake. CI/CD Pipeline Mastery: Create and manage CI/CD pipelines, ensuring automated deployments and system health monitoring. Advanced Data Integration: Develop sophisticated strategies for … standards. Data-Driven Culture Champion: Advocate for the strategic use of data across the organization. Skills-wise, you'll definitely have: expertise in Apache Spark; advanced proficiency in Python and PySpark; extensive experience with Databricks; advanced SQL knowledge; proven leadership abilities in data engineering; strong experience in building and managing CI/CD pipelines; experience in implementing data integrations with …
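Advanced integrations with advertising platforms, as referenced above, typically land raw exports and upsert them into Delta tables. A minimal sketch using the Delta Lake Python API; the landing path, table, and join keys are hypothetical:

```python
# Illustrative Delta Lake upsert (MERGE) of freshly landed ad-spend data
# into a curated table. All names are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

updates = spark.read.json("/mnt/lake/landing/ad_spend/")  # hypothetical landing zone

target = DeltaTable.forName(spark, "marts.ad_spend")
(
    target.alias("t")
    .merge(updates.alias("s"), "t.campaign_id = s.campaign_id AND t.date = s.date")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```

MERGE keeps the load idempotent: re-running the same extract updates existing rows instead of duplicating them.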
City of London, London, United Kingdom Hybrid / WFH Options
Noir
Data Engineer - Leading Energy Company - London (Tech Stack: Data Engineer, Databricks, Python, PySpark, Power BI, AWS QuickSight, AWS, TSQL, ETL, Agile Methodologies) Company Overview: Join a dynamic team at a leading player in the energy sector, committed to innovation and sustainable solutions. Our client is seeking a talented Data Engineer to help build and optimise their data infrastructure, enabling them …
Reading, Berkshire, United Kingdom Hybrid / WFH Options
Bowerford Associates
technical concepts to a range of audiences. Able to provide coaching and training to less experienced members of the team. Essential Skills: Programming languages such as Spark, Java, Python, PySpark, Scala or similar (minimum of 2). Extensive Big Data hands-on experience across coding/configuration/automation/monitoring/security is necessary. Significant AWS or Azure … Right to Work in the UK long-term as our client is NOT offering sponsorship for this role. KEYWORDS Lead Data Engineer, Senior Lead Data Engineer, Spark, Java, Python, PySpark, Scala, Big Data, AWS, Azure, On-Prem, Cloud, ETL, Azure Data Fabric, ADF, Databricks, Azure Data, Delta Lake, Data Lake. Please note that due to a high level of …
Employment Type: Permanent
Salary: £75,000 - £80,000/annum + Pension, Good Holiday, Healthcare
City of London, London, United Kingdom Hybrid / WFH Options
Bounce Digital
from internal (Odoo/PostgreSQL) and external (eBay APIs) sources. Define data quality rules, set up monitoring/logging, and support architecture decisions. What You Bring: Strong SQL & Python (PySpark); hands-on with GCP or AWS. Experience with modern ETL tools (dbt, Airflow, Fivetran). BI experience (Looker, Power BI, Metabase); Git and basic CI/CD exposure. Background in …
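Data quality rules of the sort this role mentions often start life as simple assertions over staged data. A minimal PySpark sketch, with a hypothetical table and rules, that an orchestrator such as Airflow could run as a gating task:

```python
# Illustrative data-quality gate: count rule violations and fail the run
# if any are found. Table and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
orders = spark.read.table("staging.orders")

null_keys = orders.filter(F.col("order_id").isNull()).count()
dupe_keys = orders.groupBy("order_id").count().filter("count > 1").count()
bad_amounts = orders.filter(F.col("amount") < 0).count()

# A non-zero count aborts the task, surfacing the failure to the scheduler.
assert null_keys == 0, f"{null_keys} orders with null keys"
assert dupe_keys == 0, f"{dupe_keys} duplicated order ids"
assert bad_amounts == 0, f"{bad_amounts} negative amounts"
```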
City of London, London, United Kingdom Hybrid / WFH Options
Mars
pet owners everywhere. Join us on a multi-year digital transformation journey where your work will unlock real impact. 🌟 What you'll do Build robust data pipelines using Python, PySpark, and cloud-native tools Engineer scalable data models with Databricks, Delta Lake, and Azure tech Collaborate with analysts, scientists, and fellow engineers to deliver insights Drive agile DevOps practices …
City of London, London, United Kingdom Hybrid / WFH Options
Recruit with Purpose
they modernise the use of their data. Overview of responsibilities in the role: Design and maintain scalable, high-performance data pipelines using Azure Data Platform tools such as Databricks (PySpark), Data Factory, and Data Lake Gen2. Develop curated data layers (bronze, silver, gold) optimised for analytics, reporting, and AI/ML, ensuring they meet performance, governance, and reuse standards.
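To make the bronze/silver/gold idea concrete, a gold-layer build is typically an aggregation over the silver layer into a reporting-ready Delta table. A sketch under hypothetical names and Azure storage paths:

```python
# Illustrative gold-layer build: aggregate cleaned silver data into a
# reporting table. Storage account, container, and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

silver = spark.read.format("delta").load(
    "abfss://lake@storageacct.dfs.core.windows.net/silver/orders"
)

gold = (
    silver.groupBy("customer_id", F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"), F.count("*").alias("orders"))
)

gold.write.format("delta").mode("overwrite").save(
    "abfss://lake@storageacct.dfs.core.windows.net/gold/daily_revenue"
)
```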
South East London, England, United Kingdom Hybrid / WFH Options
Recruit with Purpose
they modernise the use of their data. Overview of responsibilities in the role: Design and maintain scalable, high-performance data pipelines using Azure Data Platform tools such as Databricks (PySpark), Data Factory, and Data Lake Gen2. Develop curated data layers (bronze, silver, gold) optimised for analytics, reporting, and AI/ML, ensuring they meet performance, governance, and reuse standards.
data types, data structures, schemas (JSON and Spark), and schema management. Key Skills and Experience: Strong understanding of complex JSON manipulation. Experience with data pipelines using custom Python/PySpark frameworks. Knowledge of the 4 core data categories (Reference, Master, Transactional, Freeform) and handling Reference Data. Understanding of Data Security principles, access controls, GDPR, and handling sensitive datasets. Strong … scripting, environment variables. Experience with browser-based IDEs like Jupyter Notebooks. Familiarity with Agile methodologies (SAFe, Scrum, JIRA). Languages and Frameworks: JSON, YAML, Python (advanced proficiency; Pydantic a bonus), SQL, PySpark, Delta Lake, Bash, Git, Markdown, Scala (bonus), Azure SQL Server (bonus). Technologies: Azure Databricks, Apache Spark, Delta Tables, data processing with Python, Power BI (data ingestion and integration), JIRA. Additional …
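The schema-managed JSON handling listed here usually means parsing with an explicit Spark schema rather than relying on inference, then flattening nested structures. A minimal sketch with hypothetical field names:

```python
# Illustrative schema-first JSON parsing: declare the expected structure,
# parse raw documents against it, and flatten nested fields.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import ArrayType, StringType, StructField, StructType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("id", StringType(), nullable=False),
    StructField("tags", ArrayType(StringType()), nullable=True),
    StructField("payload", StructType([
        StructField("status", StringType(), True),
    ]), True),
])

# One JSON document per line in the landing zone (hypothetical path).
raw = spark.read.text("/mnt/lake/landing/events/")
parsed = raw.select(F.from_json("value", schema).alias("doc")).select("doc.*")

# Flatten the nested array and struct into a tabular shape.
flat = parsed.select(
    "id",
    F.explode_outer("tags").alias("tag"),
    F.col("payload.status").alias("status"),
)
```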
Luton, Bedfordshire, South East, United Kingdom Hybrid / WFH Options
Anson Mccade
practice. Essential Experience: Proven expertise in building data warehouses and ensuring data quality on GCP. Strong hands-on experience with BigQuery, Dataproc, Dataform, Composer, and Pub/Sub. Skilled in PySpark, Python, and SQL. Solid understanding of ETL/ELT processes. Clear communication skills and the ability to document processes effectively. Desirable Skills: GCP Professional Data Engineer certification. Exposure to Agentic …
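On the GCP side, PySpark jobs on Dataproc commonly read from and write to BigQuery via the spark-bigquery connector. A sketch with hypothetical project, dataset, and bucket names; it assumes the connector jar is available on the cluster:

```python
# Illustrative Dataproc job: read a BigQuery table, aggregate, write back.
# Project, dataset, table, and bucket names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bq-etl-sketch").getOrCreate()

events = (
    spark.read.format("bigquery")
    .option("table", "my-project.analytics.events")
    .load()
)

daily = events.groupBy(F.to_date("event_ts").alias("day")).count()

(
    daily.write.format("bigquery")
    .option("table", "my-project.analytics.daily_counts")
    .option("temporaryGcsBucket", "my-etl-bucket")  # staging bucket the connector requires
    .mode("overwrite")
    .save()
)
```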
Integration: Develop and integrate efficient data pipelines by collecting high-quality, consistent data from external APIs and ensuring seamless incorporation into existing systems. Big Data Management and Storage: Utilize PySpark for scalable processing of large datasets, implementing best practices for distributed computing. Optimize data storage and querying within a data lake environment to enhance accessibility and performance. ML R…
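Storage and query optimisation in a data lake, as described above, often comes down to a partitioned columnar layout so that readers can prune files. A closing sketch with hypothetical paths; it assumes the source data carries an ingest_date column:

```python
# Illustrative lake layout: write partitioned Parquet so queries filtering on
# the partition column scan only the matching directories.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

df = spark.read.json("s3a://lake/landing/api_payloads/")  # hypothetical source

(
    df.write.mode("append")
    .partitionBy("ingest_date")  # assumes this column exists; enables pruning
    .parquet("s3a://lake/curated/api_payloads")
)

# Partition pruning: only directories with ingest_date >= 2024-01-01 are read.
recent = spark.read.parquet("s3a://lake/curated/api_payloads").filter(
    "ingest_date >= '2024-01-01'"
)
```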