… scalable data pipelines using PySpark 3/4 and Python 3.
* Contribute to the creation of a unified data lake following medallion architecture principles.
* Leverage Databricks and Delta Lake (Parquet format) for efficient, reliable data processing.
* Apply BDD testing practices using Python Behave and ensure code quality with Python Coverage.
* Collaborate with cross-functional teams and participate in Agile …
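As an illustration of the kind of pipeline step described above, here is a minimal sketch of a bronze-to-silver medallion transformation with PySpark and Delta Lake. The table paths and column names (such as /mnt/lake/bronze/orders, order_id, amount) are assumptions for the example, not details from the role.

```python
# Minimal sketch: bronze -> silver medallion step with PySpark and Delta Lake.
# Paths and column names are illustrative assumptions only.
from pyspark.sql import SparkSession, functions as F

# On Databricks the session already has Delta Lake configured.
spark = SparkSession.builder.appName("bronze_to_silver").getOrCreate()

# Read raw (bronze) records stored as a Delta table backed by Parquet files.
bronze = spark.read.format("delta").load("/mnt/lake/bronze/orders")

# Cleanse and conform: de-duplicate, enforce types, keep valid rows only.
silver = (
    bronze
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .filter(F.col("amount") > 0)
)

# Write the curated (silver) layer back as Delta, partitioned for efficient reads.
(silver.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("/mnt/lake/silver/orders"))
```

In a BDD setup, Python Behave features would describe the expected silver-layer behaviour in Gherkin, with step implementations asserting against DataFrames like the one above, and Python Coverage run over the step code.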
Job Description: AWS stack, with data landed in S3, Lambda triggers, data-quality checks, data written back out to S3 in Parquet format, and Snowflake for the dimensional model. Design and build the data pipelines and work with colleagues to understand the data transformations (supported by BAs), building out the pipelines and moving data through the layers of the data architecture (medallion architecture) …
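To make the S3-landing-to-Parquet flow concrete, below is a hypothetical sketch of an S3-triggered Lambda handler that applies a simple data-quality gate and writes the result back to S3 as Parquet for Snowflake to ingest. The bucket names, key layout, and the data-quality rule are assumptions for illustration; pandas and pyarrow would need to be packaged with the function (for example as a layer).

```python
# Hypothetical sketch: S3-triggered Lambda that validates landed data and writes
# it back out to S3 in Parquet. Buckets, keys, and the quality rule are assumptions.
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

def handler(event, context):
    # The S3 put event carries the landing bucket and object key.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Read the landed file (assumed CSV here).
    obj = s3.get_object(Bucket=bucket, Key=key)
    df = pd.read_csv(obj["Body"])

    # Simple data-quality gate: drop rows missing the primary key.
    df = df.dropna(subset=["order_id"])

    # Write back to a curated bucket in Parquet, ready for Snowflake to load
    # into the dimensional model (e.g. via an external stage or Snowpipe).
    buffer = io.BytesIO()
    df.to_parquet(buffer, index=False)
    s3.put_object(
        Bucket="curated-bucket",
        Key=key.replace("landing/", "curated/").replace(".csv", ".parquet"),
        Body=buffer.getvalue(),
    )
```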
Bonus Points For:
* Workflow orchestration tools like Airflow (a minimal DAG sketch follows below).
* Working knowledge of Kafka and Kafka Connect.
* Experience with Delta Lake and lakehouse architectures.
* Proficiency in data serialization formats: JSON, XML, Parquet, YAML.
* Cloud-based data services experience.

Ready to build the future of data? If you're a collaborative, forward-thinking engineer who wants to work on meaningful, complex problems …
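For the workflow-orchestration point above, here is a minimal sketch of an Airflow DAG chaining landing, curation, and publish steps. The DAG id, task names, and placeholder callables are assumptions, and the `schedule` argument assumes Airflow 2.4 or later.

```python
# Minimal sketch of an Airflow DAG orchestrating land -> curate -> publish steps.
# DAG id, task ids, and the placeholder callables are illustrative assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def land(): ...       # e.g. pull raw files into the landing zone
def curate(): ...     # e.g. run the PySpark/Delta transformation
def publish(): ...    # e.g. refresh the dimensional model

with DAG(
    dag_id="lakehouse_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_land = PythonOperator(task_id="land_raw_files", python_callable=land)
    t_curate = PythonOperator(task_id="curate_to_delta", python_callable=curate)
    t_publish = PythonOperator(task_id="publish_dimensional_model", python_callable=publish)

    t_land >> t_curate >> t_publish
```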