SFTP protocols. ETL/ELT Pipelines: Design and optimize data pipelines using Azure Data Factory and Databricks. Medallion Architecture: Implement Bronze, Silver, and Gold layers using formats like Delta, Parquet, and JSON for data transformation. Data Modeling: Develop and optimize data models using star schema and slowly changing dimensions for analytics and operations. Data Governance: Ensure robust data security … Azure Data Engineer. Technical Expertise: Proficiency with Azure Data Factory, Databricks, and Azure Storage. Strong skills in SQL, Python, and data modeling techniques. Familiarity with data formats like Parquet and JSON. Experience with AI/ML model management on Azure Databricks. Education: Bachelor's degree in IT, Computer Science, or a related field. Microsoft Certified: Azure Data …
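Purely to illustrate the Bronze/Silver/Gold pattern this posting names, here is a minimal PySpark sketch of promoting raw JSON from a Bronze path into a typed, partitioned Delta table in Silver. The paths, column names, and schema are assumptions, not taken from the listing, and Delta support is assumed to be available (as it is on Databricks).

```python
# Illustrative Bronze -> Silver step in a medallion layout.
# Paths, columns, and types are invented for the sketch.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: raw JSON landed as-is from the source system.
bronze = spark.read.json("/mnt/lake/bronze/orders/")

# Silver: typed, de-duplicated records stored as Delta, partitioned by day.
silver = (
    bronze
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("order_date", F.to_date("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .dropDuplicates(["order_id"])
)

(
    silver.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("/mnt/lake/silver/orders/")
)
```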
tools (QuickSight, Power BI, Tableau, Looker, etc.). Interest or experience in building internal data communities or enablement programs. Working with diverse data sources (APIs, CRMs, SFTP, databases) and formats (Parquet, JSON, XML, CSV). Exposure to machine learning models or AI agents. Why Join Us: Help shape the future of data in an organization that treats data as a product …
with SQL databases (PostgreSQL, Oracle, SQL Server) Knowledge of big data technologies (Hadoop, Spark, Kafka) Familiarity with cloud platforms and containerization (Docker, Kubernetes) Understanding of data formats (JSON, XML, Parquet, Avro) Professional Experience Bachelor's degree in Computer Science, Engineering, or related field 5+ years of experience in data engineering or related roles Experience working in classified or high …
Herndon, Virginia, United States Hybrid / WFH Options
Maxar Technologies Holdings Inc
Minimum of 3 years' experience with Python. Demonstrated experience building & orchestrating automated, production-level data pipelines and solutions (ETL/ELT). Experience with file-based data storage, including Parquet or Iceberg. Experience with data catalogs (e.g. Hive, AWS Glue). General understanding of key AWS services (e.g. EC2, S3, EKS, IAM, Lambda). Experience building and/or …
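As a small sketch of the file-based storage and catalog skills this posting lists, the snippet below writes a Hive-partitioned Parquet dataset with pyarrow; the columns and output path are made up, and in practice the root path would usually be an S3 prefix that a Glue crawler or catalog entry points at.

```python
# Minimal pyarrow example: write a partitioned Parquet dataset.
# Columns and the output directory are illustrative only.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "sensor_id": ["a", "a", "b"],
    "reading": [1.2, 1.4, 0.9],
    "ingest_date": ["2024-01-01", "2024-01-01", "2024-01-02"],
})

# Hive-style partition folders (ingest_date=...) that Athena, Hive, or a
# Glue-crawled table can discover later.
pq.write_to_dataset(table, root_path="data/readings", partition_cols=["ingest_date"])
```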
All our office locations considered: Newbury & Liverpool (UK); Šibenik, Croatia (considered) We're on the hunt for builders. No, we've not ventured into construction in our quest to conquer the world, rather a designer and builder of systems …
services, especially Glue, Athena, Lambda, and S3. Proficient in Python (ideally PySpark) and modular SQL for transformations and orchestration. Solid grasp of data modeling (partitioning, file formats like Parquet, etc.). Comfort with CI/CD, version control, and infrastructure-as-code tools. If this sounds like you then send your CV …
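To make the Glue/Athena/S3 combination above concrete, here is a hedged boto3 sketch that runs a query against a catalogued table and polls for completion; the database name, table, query, and results bucket are placeholders, not details from the posting.

```python
# Hedged sketch: run an Athena query over a catalogued table via boto3.
import time
import boto3

athena = boto3.client("athena", region_name="eu-west-1")

resp = athena.start_query_execution(
    QueryString=(
        "SELECT event_date, count(*) AS events "
        "FROM events WHERE event_date = DATE '2024-01-01' GROUP BY event_date"
    ),
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)

# Poll until the query reaches a terminal state.
query_id = resp["QueryExecutionId"]
while True:
    status = athena.get_query_execution(QueryExecutionId=query_id)
    state = status["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

print(state)
```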
real-time streaming applications, preferably with experience in Kafka real-time messaging or Azure Stream Analytics/Event Hubs. Spark processing and performance tuning. File format partitioning, e.g. Parquet, JSON, XML, CSV. Azure DevOps, GitHub Actions. Hands-on experience in at least one of the listed languages (e.g. Python), with knowledge of the others. Experience in data modeling. Experience of synchronous and …
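As an illustration of the Kafka-plus-Spark streaming skills listed above, the sketch below reads JSON events from a Kafka topic with Spark Structured Streaming and writes them out as date-partitioned Parquet. The broker address, topic, schema, and paths are assumptions, and the job needs the spark-sql-kafka connector on the classpath.

```python
# Illustrative streaming job: Kafka -> date-partitioned Parquet.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("kafka-to-parquet").getOrCreate()

# Hypothetical event schema for the JSON payloads.
schema = StructType([
    StructField("device_id", StringType()),
    StructField("value", DoubleType()),
    StructField("event_time", TimestampType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "telemetry")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
    .withColumn("event_date", F.to_date("event_time"))
)

query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://example-lake/telemetry/")
    .option("checkpointLocation", "s3a://example-lake/_checkpoints/telemetry/")
    .partitionBy("event_date")
    .trigger(processingTime="1 minute")
    .start()
)
```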
Reading, England, United Kingdom Hybrid / WFH Options
Areti Group | B Corp™
Expert knowledge of the Microsoft Fabric Analytics Platform (Azure SQL, Synapse, Power BI). • Proficient in Python for data engineering tasks, including data ingestion from APIs, creation and management of Parquet files, and execution of ML models. • Strong SQL skills, enabling support for Data Analysts with efficient and performant queries. • Skilled in optimizing data ingestion and query performance for MSSQL …
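Purely as a sketch of the "ingestion from APIs into Parquet files" bullet, the snippet below pulls one page of records from a hypothetical REST endpoint and writes them to a Parquet file; the URL, field names, and response shape are invented.

```python
# Minimal API-to-Parquet ingestion sketch; endpoint and fields are hypothetical.
import pandas as pd
import requests

resp = requests.get(
    "https://api.example.com/v1/orders",
    params={"page_size": 500},
    timeout=30,
)
resp.raise_for_status()

# Flatten the JSON payload into a tabular frame and type the timestamp column.
df = pd.json_normalize(resp.json()["results"])
df["order_ts"] = pd.to_datetime(df["order_ts"])

# Requires a Parquet engine such as pyarrow to be installed.
df.to_parquet("orders.parquet", index=False)
```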
hands-on AWS experience – S3, Redshift, Glue essential. Proven experience building ETL/ELT pipelines in cloud environments. Proficient in working with structured/unstructured data (JSON, XML, CSV, Parquet). Skilled in working with relational databases and data lake architectures. Experienced with Matillion and modern data visualisation tools (QuickSight, Tableau, Looker, etc.). Strong scripting and Linux/…
London, England, United Kingdom Hybrid / WFH Options
Anson McCade
hands-on AWS experience – S3, Redshift, Glue essential. Proven experience building ETL/ELT pipelines in cloud environments. Proficient in working with structured/unstructured data (JSON, XML, CSV, Parquet). Skilled in working with relational databases and data lake architectures. Experienced with Matillion and modern data visualisation tools (QuickSight, Tableau, Looker, etc.). Strong scripting and Linux/…
and Matillion. Translate client requirements into scalable and secure data architectures. Drive infrastructure-as-code and CI/CD deployment practices. Process structured and semi-structured data (JSON, XML, Parquet, CSV). Maintain metadata, build data dictionaries, and ensure governance is embedded by design. Work across industries in fast-paced, high-value engagements. This Principal Data Engineer will bring: Extensive …
AWS serverless services and enables powerful querying and analytics through Amazon Athena. In this role, you'll work on a system that combines streaming ingestion (Firehose), data lake technologies (Parquet, Apache Iceberg), scalable storage (S3), event-driven processing (Lambda, EventBridge), fast access databases (DynamoDB), and robust APIs (Spring Boot microservices on EC2). Your role will involve designing, implementing … processing pipeline and platform services. Key Responsibilities: Design, build, and maintain serverless data processing pipelines using AWS Lambda, Firehose, S3, and Athena. Optimize data storage and querying performance using Parquet and Iceberg formats. Manage and scale event-driven workflows using EventBridge and Lambda. Work with DynamoDB for fast, scalable key-value storage. Develop and maintain Java Spring Boot microservices … Java backend development experience. 3+ years of Python development. Strong hands-on experience with AWS services: Lambda, S3, K8S. Deep understanding of data lake architectures and formats such as Parquet and Iceberg. Proficiency in Spring Boot and working experience with microservices. Experience with high-scale, event-driven systems and serverless patterns. Nice to Have: Solid understanding of distributed systems …
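As a hedged sketch of the event-driven, serverless side of a pipeline like this, here is a Python Lambda handler triggered by S3 object-created events that records each landed file in a DynamoDB table. The table name, attribute names, and the idea of driving downstream work from this metadata are illustrative assumptions, not the system's actual design.

```python
# Sketch of an S3-triggered Lambda that tracks landed objects in DynamoDB.
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("ingested-objects")  # hypothetical table name

def handler(event, context):
    records = event.get("Records", [])
    for record in records:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        size = record["s3"]["object"].get("size", 0)
        # Track each landed file so downstream jobs (e.g. Athena partition
        # registration or Iceberg commits) can be driven from this metadata.
        table.put_item(Item={"object_key": key, "bucket": bucket, "size_bytes": size})
    return {"processed": len(records)}
```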
data warehousing (e.g. Hadoop, Spark, Redshift, Snowflake, GCP BigQuery). Expertise in building data architectures that support batch and streaming paradigms. Experience with standards such as JSON, XML, YAML, Avro, Parquet. Strong communication skills. Open to learning new technologies, methodologies, and skills. As the successful Data Engineering Manager you will be responsible for: Building and maintaining data pipelines. Identifying and …
management systems. Analyze and cleanse data using a range of tools and techniques. Manage and process structured and semi-structured data formats such as JSON, XML, CSV, and Parquet. Operate effectively in Linux and cloud-based environments. Support CI/CD processes and adopt infrastructure-as-code principles. Contribute to a collaborative, knowledge-sharing team culture.
Cardiff, South Glamorgan, United Kingdom Hybrid / WFH Options
RVU Co UK
Experience with alternative data technologies (e.g. DuckDB, Polars, Daft). Familiarity with eventing technologies (Event Hubs, Kafka, etc.). Deep understanding of file formats and their behaviour, such as Parquet, Delta, and Iceberg. What we offer We want to give you a great work environment; contribute back to both your personal and professional development; and give you great benefits …
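As a quick illustration of the "alternative data technologies" named above, the sketch below uses DuckDB to aggregate directly over Parquet files; the file glob and column names are hypothetical.

```python
# Query Parquet files in place with DuckDB; path and columns are illustrative.
import duckdb

con = duckdb.connect()
result = con.execute("""
    SELECT provider, avg(quote) AS avg_quote
    FROM read_parquet('quotes/*.parquet')
    GROUP BY provider
    ORDER BY avg_quote
""").df()  # fetch the result as a pandas DataFrame

print(result)
```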
data. Architect and build data pipelines which fetch data from public and private data suppliers' APIs, S3 buckets, and web interfaces in various formats (e.g., JSON, CSV, Excel, PDF, Parquet), join geographical shapes with data from multiple sources, and perform various transformations. Create programmatically validated data schemas, as well as human-readable documentation, to specify requirements to our partners.
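To give one concrete, entirely hypothetical flavour of the "programmatically validated data schemas" responsibility above, the sketch below pins an expected Arrow schema for a supplier CSV and fails loudly if the feed drifts; the column names, types, and file are invented.

```python
# Validate an incoming CSV feed against a pinned Arrow schema.
import pandas as pd
import pyarrow as pa

# The expected contract for the feed; doubles as machine-readable documentation.
EXPECTED = pa.schema([
    ("region_code", pa.string()),
    ("population", pa.int64()),
    ("median_income", pa.float64()),
])

df = pd.read_csv("supplier_feed.csv")

# Selecting by expected column names raises if a column is missing;
# casting raises if a supplier silently changes a column's type.
table = pa.Table.from_pandas(df[[f.name for f in EXPECTED]], preserve_index=False)
table = table.cast(EXPECTED)
```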
Java. Experience with Big Data streaming platforms including Spark. Experience with deploying and managing Jupyter Notebook environments. Experience with data parsing/transformation technologies including JSON, XML, CSV, and Parquet formats. Experience with stream/batch Big Data processing and analytic frameworks. Experience with CI/CD principles, methodologies, and tools such as GitLab CI. Experience with IaC (Infrastructure …
we do. Passion for data and experience working within a data-driven organization. Hands-on experience with architecting, implementing, and performance tuning of: Data Lake technologies (e.g. Delta Lake, Parquet, Spark, Databricks); APIs & microservices; message queues, streaming technologies, and event-driven architecture; NoSQL databases and query languages; data domain and event data models; data modelling; logging and monitoring; container …
software, libraries, and packages involving stream/batch data processing and analytic frameworks. Experience with data parsing/transformation technologies and file formats including JSON, XML, CSV, TCLD, and Parquet. General cloud and HPC knowledge regarding compute, networking, memory, and storage components. Experience with Linux administration including software integration, configuration management, and routine O&M operations related to provisioning …
tools such as Spark, NiFi, Kafka, or Flink at multi-petabyte scale. Experience in designing and maintaining ETL or ELT data pipelines utilizing storage and serialization formats and schemas such as Parquet and Avro. Experience administering and maintaining data science workspaces and tool benches for Data Scientists and Analysts. Secret clearance. HS diploma or GED. Nice If You Have: Experience deploying …