- … self-teach new tools and technologies; an ability to educate others
- Familiarity with the Linux command line and shell scripting
- Understanding of basic data concepts: file formats (e.g. CSV, Parquet), data pipelines, and storage layers
- Exposure to containers such as Podman or Docker
- Comfortable using Git; an awareness of CI/CD practices and tools such as GitHub Actions or …
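The file-format conversion this listing alludes to is a one-step operation in Python. Below is a minimal sketch of turning a CSV into Parquet with pandas and PyArrow, assuming both are installed; the file names are placeholders, not taken from the listing:

```python
import pandas as pd

# Read the source CSV into a DataFrame (file name is hypothetical).
df = pd.read_csv("events.csv")

# Write it back out as columnar Parquet; pandas delegates to PyArrow.
df.to_parquet("events.parquet", engine="pyarrow", compression="snappy")
```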
- … optimising performance across SQL Server, PostgreSQL, and cloud databases
- Proven track record with complex data migration projects (terabyte+ datasets, multiple legacy source systems, structured and unstructured data)
- Proficiency with Parquet/Delta Lake or other modern data storage formats
- Experience with streaming architectures using Kafka, Event Hubs, or Kinesis for real-time data processing
- Knowledge of data architectures supporting …
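The real-time processing requirement above typically amounts to consuming records from a topic and transforming them as they arrive. A minimal sketch using the kafka-python client; the topic name, broker address, and consumer group are hypothetical, not from the listing:

```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",                                   # hypothetical topic
    bootstrap_servers="localhost:9092",         # assumed local broker
    group_id="etl-consumer",                    # hypothetical consumer group
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

# Each message is deserialised from JSON before any downstream transform.
for message in consumer:
    record = message.value
    print(record)  # placeholder for a real transform/load step
```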
- Familiarity with C++ is a plus
- Bonus++: experience working with quants, data scientists, or in trading environments

Tech Stack:
- Python 3.11+ (fast, modern, typed)
- Dask, pandas, PyArrow, NumPy
- PostgreSQL, Parquet, S3
- Airflow, Docker, Kubernetes, GitLab CI
- Internal frameworks built for scale and speed

Why Join:
- Engineers own projects end-to-end, from design to deployment to impact
- Work side …
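To show how the Dask/PyArrow/S3 pieces of this stack fit together, here is a minimal sketch of a lazy aggregation over a Parquet dataset on S3; the bucket path and column names are invented, and s3fs is assumed to be installed for the s3:// protocol:

```python
import dask.dataframe as dd

# Lazily open a partitioned Parquet dataset on S3 (path is hypothetical).
df = dd.read_parquet("s3://example-bucket/ticks/", engine="pyarrow")

# Mean price per symbol; work is spread across partitions and only
# materialised when .compute() is called.
mean_price = df.groupby("symbol")["price"].mean().compute()
print(mean_price.head())
```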
… our data lake platform:
- Kubernetes for data services and task orchestration
- Terraform for infrastructure
- Streamlit for data applications
- Airflow purely for job scheduling and tracking
- CircleCI for continuous deployment
- Parquet and Delta file formats on S3 for data lake storage
- Spark for data processing
- dbt for data modelling
- SparkSQL for analytics

Why else you'll love it here:
- Salary …
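As an illustration of the Spark/Delta/SparkSQL combination in this platform, a minimal PySpark sketch that registers a Delta table from S3 and queries it with SparkSQL; the bucket, table path, and column names are made up, and the delta-spark package plus S3 credentials are assumed to be configured:

```python
from pyspark.sql import SparkSession

# Enable Delta Lake support (standard delta-spark session configuration).
spark = (
    SparkSession.builder
    .appName("lake-analytics-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Load a Delta table from S3 and expose it to SparkSQL (path is hypothetical).
spark.read.format("delta").load("s3a://example-bucket/lake/orders") \
    .createOrReplaceTempView("orders")

daily = spark.sql("""
    SELECT order_date, COUNT(*) AS n_orders
    FROM orders
    GROUP BY order_date
    ORDER BY order_date
""")
daily.show()
```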