Data Engineer
Mid-Level Data Engineer (Azure / Databricks)
NO VISA REQUIREMENTS
Location: Glasgow (3+ days per week on site)
Reports to: Head of IT
My client is undergoing a major transformation of their entire data landscape, migrating from legacy systems and manual reporting to a modern Azure + Databricks Lakehouse. They are building a secure, automated, enterprise-grade platform powered by Lakeflow Declarative Pipelines, Unity Catalog and Azure Data Factory.
They are looking for a Mid-Level Data Engineer to help deliver high-quality pipelines and curated datasets used across Finance, Operations, Sales, Customer Care and Logistics.
What You'll Do
Lakehouse Engineering (Azure + Databricks)
Build and maintain scalable ELT pipelines using Lakeflow Declarative Pipelines, PySpark and Spark SQL.
Work within a Medallion architecture (Bronze → Silver → Gold) to deliver reliable, high-quality datasets.
Ingest data from multiple sources including ChargeBee, legacy operational files, SharePoint, SFTP, SQL, REST and GraphQL APIs using Azure Data Factory and metadata-driven patterns.
Apply data quality and validation rules using Lakeflow Declarative Pipelines expectations.
Curated Layers & Data Modelling
Develop clean, conformed Silver & Gold layers aligned to enterprise subject areas.
Contribute to dimensional modelling (star schemas), harmonisation logic, SCDs and business marts powering Power BI datasets.
Apply governance, lineage and permissioning through Unity Catalog.
Orchestration & Observability
Use Lakeflow Workflows and ADF to orchestrate and optimise ingestion, transformation and scheduled jobs.
Help implement monitoring, alerting, SLAs/SLIs and runbooks to support production reliability.
Assist in performance tuning and cost optimisation.
DevOps & Platform Engineering
Contribute to CI/CD pipelines in Azure DevOps to automate deployment of notebooks, Lakeflow Declarative Pipelines, SQL models and ADF assets.
Support secure deployment patterns using private endpoints, managed identities and Key Vault.
Participate in code reviews and help improve engineering practices.
Collaboration & Delivery
Work with BI and Analytics teams to deliver curated datasets that power dashboards across the business.
Contribute to architectural discussions and the ongoing data platform roadmap.
Tech You'll Use
Databricks: Lakeflow Declarative Pipelines, Lakeflow Workflows, Unity Catalog, Delta Lake
Azure: ADLS Gen2, Data Factory, Event Hubs (optional), Key Vault, private endpoints
Languages & tools: PySpark, Spark SQL, Python, Git
DevOps: Azure DevOps Repos & Pipelines, CI/CD
Analytics: Power BI, Fabric
What We're Looking For
Experience
Proven commercial data engineering experience.
Hands-on experience delivering solutions on Azure + Databricks.
Strong PySpark and Spark SQL skills within distributed compute environments.
Experience working in a Lakehouse/Medallion architecture with Delta Lake.
Understanding of dimensional modelling (Kimball), including SCD Type 1/2.
Exposure to operational concepts such as monitoring, retries, idempotency and backfills.
Mindset
Keen to grow within a modern Azure Data Platform environment.
Comfortable with Git, CI/CD and modern engineering workflows.
Able to communicate technical concepts clearly to non-technical stakeholders.
Quality-driven, collaborative and proactive.
Nice to Have
Databricks Certified Data Engineer Associate.
Experience with streaming ingestion (Auto Loader, event streams, watermarking).
Subscription/entitlement modelling (e.g., ChargeBee).
Unity Catalog advanced security (RLS, PII governance).
Terraform or Bicep for IaC.
Fabric Semantic Models or Direct Lake optimisation experience.
Why Join?
Opportunity to shape and build a modern enterprise Lakehouse platform.
Hands-on work with Azure, Databricks and leading-edge engineering practices.
Real progression opportunities within a growing data function.
Direct impact across multiple business domains.