Data Engineer
Full time, Hybrid: Leeds/London/Sheffield
SC clearance or SC eligible
Job summary
This is a pivotal engineering role at the heart of one of the most significant data transformation programmes in the UK's largest public service department. You will join a strategic engagement programme to design, build, and operationalise a cloud-native data lakehouse on Microsoft Fabric.
The department serves over 22 million citizens, and this platform will directly underpin data-driven decision-making at national scale.
You will take a leading role in the design and delivery of data pipelines, data transformation layers, and lakehouse infrastructure using Microsoft Fabric, Azure Data Factory, and related Azure-native technologies.
You will work in agile squads alongside architects, analysts, and DevOps engineers, contributing to private beta builds, public beta expansion, and full platform operationalisation.
Key responsibilities
- Design, build, and optimise data pipelines using Microsoft Fabric (Data Factory, Dataflows Gen2) and Azure Data Factory to ingest data from legacy systems and third-party sources.
- Develop and maintain the Bronze, Silver, and Gold layers of the lakehouse architecture using OneLake, Delta Lake, and Apache Spark within Fabric.
- Implement data transformation logic using PySpark, SQL, and Fabric Notebooks; ensure data quality, lineage, and cataloguing via Microsoft Purview.
- Collaborate with Technical Architects and Infrastructure Engineers to support CI/CD pipelines, infrastructure-as-code, and platform automation.
- Contribute to knowledge transfer workshops, running instructions, and documentation to build internal capability.
- Support governance compliance including Digital Design Authority reviews, Red Lines Assessments, and security controls.
Essential requirements
- Strong experience with Microsoft Fabric (Lakehouses, Data Pipelines, Dataflows Gen2, Fabric Notebooks)
- Strong experience with Azure Data Factory (ADF) for orchestration and data movement
- Strong proficiency in PySpark, SQL, and Python for large-scale data transformation
- Strong experience with Delta Lake, Apache Spark, and OneLake architecture
- Good knowledge of Microsoft Purview for data governance, cataloguing, and lineage
- Good experience with Azure DevOps, Git-based version control, and CI/CD pipelines
- Good understanding of data lakehouse architecture (medallion architecture – Bronze/Silver/Gold)
- Good knowledge of Azure storage services (ADLS Gen2, Azure Blob Storage)
Nice to have skills
- Experience with Azure Synapse Analytics or migration from Synapse to Fabric
- Familiarity with Databricks or equivalent distributed processing platforms
- Experience in UK public sector or government data environments
- Understanding of SC clearance requirements and government security classifications
- Knowledge of DDaT frameworks and GDS delivery standards
Qualifications
- Relevant degree in Computer Science, Data Engineering, or related discipline (or equivalent experience)
- Microsoft Certified: Azure Data Engineer Associate (DP-203) – desirable
- Microsoft Certified: Fabric Analytics Engineer Associate (DP-600) – desirable