Databricks Architect (Contract) - Greenfield Data Platform

Location: Hybrid working (London)

Duration: 12-month initial contract

Are you a visionary Databricks Architect with a passion for building cutting-edge data platforms from the ground up? Do you thrive on shaping strategy and driving technical excellence in a greenfield environment?

Our client is embarking on a pivotal journey to establish a brand-new, enterprise-grade data platform using the full power of Databricks. This is a unique opportunity to lead the architectural design and implementation of a truly greenfield data ecosystem that will underpin all future data-driven initiatives, from advanced analytics to AI/ML.

We are looking for a hands-on architect who can translate business needs into robust, scalable, and secure Databricks solutions.

The Role:

As our Databricks Architect, you will be instrumental in defining and delivering our new data strategy and architecture. This is a greenfield project, meaning you'll have the exciting challenge of building the entire Databricks Lakehouse Platform from scratch. You will provide critical technical leadership, guidance, and hands-on expertise to ensure the successful establishment of a scalable, high-performance, and future-proof data environment.

Phase 1: Strategic Vision & Blueprint

  • Data Strategy & Roadmap: Collaborate with business stakeholders and leadership to define the overarching data vision, strategy, and a phased roadmap for the Databricks Lakehouse Platform.
  • Architectural Design: Lead the end-to-end design of the Databricks Lakehouse architecture (Medallion architecture), including data ingestion patterns, storage layers (Delta Lake), processing frameworks (Spark), and consumption mechanisms.
  • Technology Selection: Evaluate and recommend optimal Databricks features and integrations (e.g., Unity Catalog, Photon, Delta Live Tables, MLflow) and complementary cloud services (e.g., Azure Data Factory, Azure Data Lake Storage, Power BI).
  • Security & Governance Frameworks: Design robust data governance, security, and access control models within the Databricks ecosystem, ensuring compliance with industry standards and regulations.

Phase 2: Core Platform Build & Development

  • Hands-on Implementation: Act as a lead engineer in the initial build-out of core data pipelines, ETL/ELT processes, and data models using PySpark, SQL, and Databricks notebooks.
  • Data Ingestion & Integration: Establish scalable data ingestion frameworks from diverse sources (batch and streaming) into the Lakehouse; one such pattern is sketched after this list.
  • Performance Optimisation: Design and implement solutions for optimal data processing performance, cost efficiency, and scalability within Databricks.
  • CI/CD & Automation: Develop and implement Continuous Integration/Continuous Delivery (CI/CD) pipelines for automated deployment of Databricks assets and data solutions.
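
For illustration, a minimal sketch of one ingestion pattern this phase covers: incremental file ingestion with Databricks Auto Loader into a bronze Delta table. The storage path, schema/checkpoint locations, and table name are hypothetical placeholders.

```python
# A minimal Auto Loader ingestion sketch with hypothetical paths and
# table names; runs in a Databricks notebook where `spark` is predefined.

raw_path = "abfss://landing@<storage-account>.dfs.core.windows.net/orders/"

bronze_stream = (
    spark.readStream.format("cloudFiles")            # Auto Loader source
    .option("cloudFiles.format", "json")             # raw files are JSON
    .option("cloudFiles.schemaLocation", "/tmp/schemas/orders")
    .load(raw_path)
)

(
    bronze_stream.writeStream
    .option("checkpointLocation", "/tmp/checkpoints/orders")
    .trigger(availableNow=True)   # process new files, then stop (batch-style run)
    .toTable("lakehouse.bronze.orders")   # governed Unity Catalog Delta table
)
```

The availableNow trigger lets the same streaming code run as a scheduled incremental batch job, one common way to balance latency against cost.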

Phase 3: Enablement, Optimisation & Transition

  • Team Enablement: Provide mentorship and technical guidance to a growing team of Data Engineers and Analysts, fostering best practices and Databricks expertise.
  • Data Quality & Monitoring: Implement comprehensive data quality checks, monitoring, and alerting mechanisms to ensure data integrity and reliability (see the sketch after this list).
  • MLOps Integration: Lay the groundwork for seamless integration with Machine Learning Operations (MLOps) capabilities for future AI initiatives.
  • Documentation & Knowledge Transfer: Create comprehensive technical documentation and conduct knowledge transfer sessions to ensure long-term sustainability of the platform.
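
For illustration, a minimal sketch of the declarative quality checks this phase describes, using Delta Live Tables expectations (one of the features named in Phase 1). The table, source, and rule names are hypothetical.

```python
# A minimal Delta Live Tables quality sketch with hypothetical names;
# runs only inside a DLT pipeline, where the `dlt` module is available.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Silver orders with basic quality rules applied")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop failing rows
@dlt.expect("recent_order", "order_date >= '2020-01-01'")      # record violations only
def silver_orders():
    return (
        dlt.read_stream("bronze_orders")              # hypothetical bronze table
        .withColumn("processed_at", F.current_timestamp())
    )
```

Expectation results are captured in the pipeline's event log, which can feed the monitoring and alerting mechanisms described above.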

Required Skills & Experience

  • Proven Databricks Expertise: Deep, hands-on experience designing and implementing solutions on the Databricks Lakehouse Platform (Delta Lake, Unity Catalog, Spark, Databricks SQL).
  • Cloud Data Architecture: Extensive experience with Azure data services (e.g., Azure Data Factory, Azure Data Lake Storage, Azure Synapse) and architecting cloud-native data platforms.
  • Programming Proficiency: Expert-level skills in Python (PySpark) and SQL for data engineering and transformation. Scala is a strong plus.
  • Data Modelling: Strong understanding and practical experience with data warehousing, data lake, and dimensional modelling concepts.
  • ETL/ELT & Data Pipelines: Proven track record of designing, building, and optimising complex data pipelines for both batch and real-time processing.

Desirable Skills & Certifications

  • Databricks Certified Data Engineer Associate/Professional.
  • Microsoft Certified: Azure Data Engineer Associate (DP-203) or Azure Solutions Architect Expert (AZ-305/304).
  • Experience with other cloud providers (AWS, GCP).
  • Knowledge of streaming technologies (Kafka, Event Hubs).

Company: Osmii

Location: City of London, Greater London, UK (Hybrid / WFH Options)