Looker, etc.) Interest or experience in building internal data communities or enablement programs. Working with diverse data sources (APIs, CRMs, SFTP, databases) and formats (Parquet, JSON, XML, CSV). Exposure to machine learning models or AI agents. Why Join Us: Help shape the future of data in an organization that …
/semantics in an Azure environment. Strong ADF, Databricks, SQL, Python, Power BI. Data acquisition from various data sources including Salesforce, APIs, XML, JSON, Parquet, flat file systems and relational data. Excellent team player able to work under pressure. Effective communication and collaboration skills to work with cross-functional …
Herndon, Virginia, United States Hybrid / WFH Options
Maxar Technologies Holdings Inc
with Python. Demonstrated experience building & orchestrating automated, production-level data pipelines and solutions (ETL/ELT). Experience with file-based data storage, including Parquet or Iceberg. Experience with data catalogs (e.g. Hive, AWS Glue). General understanding of key AWS services (e.g. EC2, S3, EKS, IAM, Lambda). …
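As an aside, a minimal sketch of the kind of pipeline work described in the listing above: reading raw CSV with PySpark, applying a light transformation, and writing partitioned Parquet. The bucket, paths and column names are hypothetical, not taken from the listing.

```python
# Minimal PySpark ETL sketch; bucket, paths and column names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example-etl").getOrCreate()

# Extract: read raw CSV from object storage.
raw = spark.read.csv("s3://example-bucket/raw/orders/", header=True, inferSchema=True)

# Transform: de-duplicate and derive a date column for partitioning.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: partitioned Parquet keeps downstream scans (e.g. via a Glue catalog) cheap.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/orders/"
)
```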
processing applications, secure data access tools. Experience integrating data-driven applications with different data sources, for example: SQL databases, document databases (MongoDB, Cosmos DB, etc.), Parquet. Experience of taking different business applications and use cases and supporting their needs (query patterns etc.) within appropriate data solutions, whilst maintaining data integrity …
solutions • Contribute to infrastructure automation using CI/CD practices and infrastructure as code • Work with various data formats including JSON, XML, CSV, and Parquet • Create and maintain metadata, data dictionaries, and schema documentation Required Experience: • Strong experience with data engineering in a cloud-first environment • Hands-on expertise …
with experience in Kafka real-time messaging or Azure Stream Analytics/Event Hub. Spark processing and performance tuning. File format partitioning, e.g. Parquet, JSON, XML, CSV. Azure DevOps, GitHub Actions. Hands-on experience in at least one of Python with knowledge of the others. Experience in Data …
and automation. Proficiency in building and maintaining batch and streaming ETL/ELT pipelines at scale, employing tools such as Airflow, Fivetran, Kafka, Iceberg, Parquet, Spark and Glue for developing end-to-end data orchestration, leveraging AWS services to ingest, transform and process large volumes of structured and unstructured …
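By way of illustration only (not this employer's code), end-to-end orchestration of such a pipeline is often expressed as an Airflow DAG; the sketch below assumes Airflow 2.4+ and uses hypothetical task names and logic.

```python
# Illustrative Airflow DAG sketch (assumes Airflow 2.4+); task logic is hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # e.g. pull increments from a source system or Kafka topic
    print("extract")

def transform():
    # e.g. clean records and write Parquet/Iceberg to the lake
    print("transform")

with DAG(
    dag_id="example_elt",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task
```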
Starburst and Athena; Kafka and Kinesis; DataHub; MLflow and Airflow; Docker and Terraform; Kafka, Spark, Kafka Streams and KSQL; dbt; AWS, S3, Iceberg, Parquet, Glue and EMR for our Data Lake; Elasticsearch and DynamoDB. More information: Enjoy fantastic perks like private healthcare & dental insurance, a generous work from …
Experience in data modelling and design patterns; in-depth knowledge of relational databases (PostgreSQL) and familiarity with data lakehouse formats (storage formats, e.g. Apache Parquet, Delta tables). Experience with Spark, Databricks, data lakes/lakehouses. Experience working with external data suppliers (defining requirements for suppliers, defining Service Level …
similar tools. Leading on solution deployment using infrastructure-as-code and CI/CD practices. Transforming diverse data formats including JSON, XML, CSV, and Parquet. Creating and maintaining clear technical documentation, metadata, and data dictionaries. Your previous experience as Principal Data Engineer will include: Strong background across AWS data …
Azure Service Bus, Function Apps, ADFs. Possesses knowledge of data-related technologies such as data warehouses, Snowflake, ETL, data pipelines, PySpark, Delta tables, and file formats (Parquet, columnar). Have a good understanding of SQL and stored procedures. Be able to lead development and execution of performance and automation testing for large-scale …
streaming platforms including Spark. Experience with deploying and managing Jupyter Notebook environments. Experience with data parsing/transformation technologies including JSON, XML, CSV, and Parquet formats. Experience with stream/batch Big Data processing and analytic frameworks. Experience with CI/CD principles, methodologies, and tools such as GitLab …
stream/batch data processing and analytic frameworks. Experience with data parsing/transformation technologies and file formats including JSON, XML, CSV, TCLD, and Parquet. General cloud and HPC knowledge regarding compute, networking, memory, and storage components. Experience with Linux administration including software integration, configuration management and routine O …
Not Required: Experience with Vega, Observable Plot, ggplot or another grammar-of-graphics library. Experience in Python, FastAPI. Expertise in data engineering topics, SQL, Parquet. Experience with AWS services and serverless architectures. What we offer: Work with colleagues that lift you up, challenge you, celebrate you and help you …
and experience working within a data-driven organization. Hands-on experience with architecting, implementing, and performance tuning of: Data Lake technologies (e.g. Delta Lake, Parquet, Spark, Databricks); APIs & microservices; message queues, streaming technologies, and event-driven architecture; NoSQL databases and query languages; data domain and event data models; data …
Good understanding of cloud environments (ideally Azure), distributed computing and scaling workflows and pipelines. Understanding of common data transformation and storage formats, e.g. Apache Parquet. Awareness of data standards such as GA4GH and FAIR. Exposure to genotyping and imputation is highly advantageous. Benefits: Competitive base salary, generous pension …
large-scale datasets. Implement and manage Lake Formation and AWS Security Lake, ensuring data governance, access control, and security compliance. Optimise file formats (e.g., Parquet, ORC, Avro) for S3 storage, ensuring efficient querying and cost-effectiveness. Automate infrastructure deployment using Infrastructure as Code (IaC) tools such as Terraform or …
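A minimal sketch of what "optimising file formats for S3" can look like in practice, assuming the AWS SDK for pandas (awswrangler) and made-up bucket, database and table names: write compressed, partitioned Parquet and register it in the Glue Data Catalog so Athena scans stay cheap.

```python
# Illustrative only: compressed, partitioned Parquet on S3, registered in Glue.
import awswrangler as wr
import pandas as pd

df = pd.DataFrame({
    "event_date": ["2024-01-01", "2024-01-02"],
    "user_id": [1, 2],
    "value": [10.0, 20.0],
})

wr.s3.to_parquet(
    df=df,
    path="s3://example-bucket/curated/events/",  # hypothetical bucket
    dataset=True,                                # enable partitioning and catalog integration
    partition_cols=["event_date"],
    compression="snappy",
    database="analytics",                        # hypothetical Glue database
    table="events",                              # hypothetical Glue table
)
```

Partitioning on a common query predicate and using columnar compression are the usual levers for both query cost and S3 storage cost.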
and Lambda. IAM - Experience handling IAM resource permissions. Networking - fundamental understanding of VPC, subnet routing and gateways. Storage - strong understanding of S3, EBS and Parquet. Databases - RDS, DynamoDB. Experience doing cost estimation in Cost Explorer and planning efficiency changes. Terraform and containerisation experience. Understanding of a broad range of …
Cardiff, South Glamorgan, United Kingdom Hybrid / WFH Options
RVU Co UK
e.g. DuckDB, Polars, Daft). Familiarity with eventing technologies (Event Hubs, Kafka, etc.). Deep understanding of file formats and their behaviour, such as Parquet, Delta and Iceberg. What we offer: We want to give you a great work environment; contribute back to both your personal and professional development …
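As a small, hypothetical illustration of working directly with such formats (the file name is made up, not from the listing), DuckDB can query a Parquet file in place without loading it into a database first:

```python
# Illustrative sketch: query a Parquet file in place with DuckDB.
import duckdb

con = duckdb.connect()
rows = con.execute(
    "SELECT count(*) AS row_count FROM read_parquet('events.parquet')"
).fetchall()
print(rows)
```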