Data Engineer

Full time, Hybrid: Leeds/London/Sheffield

SC cleared or SC eligible

Job summary

This is a pivotal engineering role at the heart of one of the most significant data transformation programmes in the UK's largest public service department. You will join a strategic engagement to design, build, and operationalise a cloud-native data lakehouse on Microsoft Fabric.

The department serves over 22 million citizens, and this platform will directly underpin data-driven decision-making at national scale.

You will take a leading role in the design and delivery of data pipelines, data transformation layers, and lakehouse infrastructure using Microsoft Fabric, Azure Data Factory, and related Azure-native technologies.

You will work in agile squads alongside architects, analysts, and DevOps engineers, contributing to private beta builds, public beta expansion, and full platform operationalisation.

Key responsibilities

Design, build, and optimise data pipelines using Microsoft Fabric (Data Factory, Dataflows Gen2) and Azure Data Factory to ingest data from legacy systems and third-party sources.
Develop and maintain the Bronze, Silver, and Gold layers of the lakehouse architecture using OneLake, Delta Lake, and Apache Spark within Fabric.
Implement data transformation logic using PySpark, SQL, and Fabric Notebooks; ensure data quality, lineage, and cataloguing via Microsoft Purview.
Collaborate with Technical Architects and Infrastructure Engineers to support CI/CD pipelines, infrastructure-as-code, and platform automation.
Contribute to knowledge transfer workshops, running instructions, and documentation to build internal capability.
Support governance compliance, including Digital Design Authority reviews, Red Lines Assessments, and security controls.

Essential requirements

Strong experience with Microsoft Fabric (Lakehouses, Data Pipelines, Dataflows Gen2, Fabric Notebooks)
Strong experience with Azure Data Factory (ADF) for orchestration and data movement
Strong proficiency in PySpark, SQL, and Python for large-scale data transformation
Strong experience with Delta Lake, Apache Spark, and OneLake architecture
Good knowledge of Microsoft Purview for data governance, cataloguing, and lineage
Good experience with Azure DevOps, Git-based version control, and CI/CD pipelines
Good understanding of data lakehouse architecture (medallion architecture – Bronze/Silver/Gold)
Good knowledge of Azure storage services (ADLS Gen2, Azure Blob Storage)

Nice-to-have skills

Experience with Azure Synapse Analytics or migration from Synapse to Fabric
Familiarity with Databricks or equivalent distributed processing platforms
Experience in UK public sector or government data environments
Understanding of SC clearance requirements and government security classifications
Knowledge of DDAT frameworks and GDS delivery standards

Qualifications

Relevant degree in Computer Science, Data Engineering, or related discipline (or equivalent experience)
Microsoft Certified: Azure Data Engineer Associate (DP-203) – desirable
Microsoft Certified: Fabric Analytics Engineer Associate (DP-600) – desirable
