Developer (PySpark+Fabric)

Role: Developer (PySpark + Fabric)

Contract: 6 months with possible extension

Location: London (hybrid)

The Role:

  • The role will be integral to realising the customer's vision and strategy for transforming some of their critical application and data engineering components. As a global financial markets infrastructure and data provider, the customer keeps abreast of the latest cutting-edge technologies enabling its core services and business requirements. The role is critical to this endeavour, providing the technical thought leadership and excellence it requires.

Your responsibilities:

  • Design, build, and optimise scalable data pipelines for batch and streaming workloads
  • Develop and manage dataflows and semantic models to support specific analytics-related business requirements
  • Implement complex transformations, aggregations, and joins, ensuring performance and reliability
  • Apply robust data validation, cleansing, and profiling techniques to ensure data accuracy and consistency across datasets
  • Implement role-based access, data masking, and compliance protocols
  • Performance tune and optimise jobs and workloads to reduce latency
  • Work collaboratively with analysts and business stakeholders to translate requirements into technical solutions
  • Create, maintain, and update documentation and internal knowledge repository

Essential skills/knowledge/experience:

  • Experience of developing on the Microsoft Azure cloud platform
  • Experience of developing on the Microsoft Fabric platform
  • Knowledge of Spark programming: ability to write Spark code for large-scale data processing, including RDDs, DataFrames, and Spark SQL
  • Python/Notebook programming
  • PySpark programming
  • Spark Streaming/batch processing
  • Delta table optimisation
  • Fabric Spark jobs
  • Java programming language, OOP knowledge
  • Database knowledge, including relational and NoSQL databases
  • Experience with tools such as GitLab, Python unit testing, and CI/CD pipelines
  • Strong troubleshooting skills
  • Familiarity with Agile methodologies
  • Good English listening and speaking skills for communicating requirements and development tasks/issues
  • Hands-on experience with lakehouses, dataflows, pipelines, and semantic models
  • Ability to build ETL workflows
  • Familiarity with time-series data, market feeds, transactional records, and risk metrics
  • Familiarity with Git, DevOps pipelines, and automated deployment
  • Strong communication skills with a collaborative mindset to work with and manage stakeholders

Desirable skills/knowledge/experience:

  • Ability to prepare and process datasets for Power BI usage
  • Experience with OneLake, Azure Data Lake, and distributed computing environments
  • Understanding of regulatory frameworks such as GDPR and SOX
  • Spark application performance tuning
  • Knowledge of Docker/Kubernetes
Company: Vallum
Location: London, United Kingdom (Hybrid/Remote Options)
Employment Type: Contract
Salary: GBP Annual