Data & AI - LLM Model Developer (PySpark Engineer)

Lead PySpark Engineer (Cloud Migration)

Role Type: 5-Month Contract

Location: Remote (UK-Based)

Experience Level: Lead / Senior (5+ years PySpark)

Role Overview

We are seeking a Lead PySpark Engineer to drive a large-scale data modernisation project, transitioning legacy data workflows into a high-performance AWS cloud environment. This is a hands-on technical role focused on converting legacy SAS code into production-ready PySpark pipelines within a complex financial services landscape.

Key Responsibilities

  • Code Conversion: Lead the end-to-end migration of SAS code (Base SAS, Macros, DI Studio) to PySpark using automated tools (SAS2PY) and manual refactoring.
  • Pipeline Engineering: Design, build, and troubleshoot complex ETL/ELT workflows and data marts on AWS.
  • Performance Tuning: Optimise Spark workloads for execution efficiency, partitioning, and cost-effectiveness.
  • Quality Assurance: Implement clean coding principles, modular design, and robust unit/comparative testing to ensure data accuracy throughout the migration.
  • Engineering Excellence: Maintain Git-based workflows, CI/CD integration, and comprehensive technical documentation.

Technical Requirements

  • PySpark (P3): 5+ years of hands-on experience writing scalable, production-grade PySpark/Spark SQL.
  • AWS Data Stack (P3): Strong proficiency in EMR, Glue, S3, Athena, and Glue Workflows.
  • SAS Knowledge (P1): Solid foundation in SAS to enable the understanding and debugging of legacy logic for conversion.
  • Data Modelling: Expertise in ETL/ELT, dimensions, facts, slowly changing dimensions (SCDs), and data mart architecture.
  • Engineering Quality: Experience with parameterisation, exception handling, and modular Python design.

Additional Details

  • Industry: Financial Services experience is highly desirable.
  • Working Pattern: Fully remote with internal team collaboration days.
  • Benefits: 33 days' holiday entitlement (pro rata).

Job Details

Company: Randstad Digital
Location: United Kingdom (Hybrid / Remote Options)