relevant professional certifications. Advanced SQL knowledge for database querying. Proficiency with big data tools (Hadoop, Spark) and familiarity with big data file formats (Parquet, Avro). Skilled in data pipeline and workflow management tools (Apache Airflow, NiFi). Strong background in programming (Python, Scala, Java) for data pipeline …
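As an illustrative sketch of the Spark-plus-SQL-over-Parquet work this listing describes (the bucket path and column names are hypothetical, not taken from the posting):

    from pyspark.sql import SparkSession

    # Start a Spark session (cluster configuration omitted for brevity).
    spark = SparkSession.builder.appName("parquet-sql-example").getOrCreate()

    # Read a Parquet dataset and query it with SQL; path and columns are placeholders.
    orders = spark.read.parquet("s3://example-bucket/orders/")
    orders.createOrReplaceTempView("orders")

    daily_totals = spark.sql("""
        SELECT order_date, SUM(amount) AS total_amount
        FROM orders
        GROUP BY order_date
    """)
    daily_totals.show()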
Looker, etc.) Interest or experience in building internal data communities or enablement programs. Working with diverse data sources (APIs, CRMs, SFTP, databases) and formats (Parquet, JSON, XML, CSV). Exposure to machine learning models or AI agents. Why Join Us: Help shape the future of data in an organization that …
automation) using one or more programming languages: Python, Java, and/or Scala. Knowledge of NoSQL and RDBMS databases. Experience with different data formats (Avro, Parquet). Have a collaborative and co-creative mindset with excellent communication skills. Motivated to work in an environment that allows you to work and take …
Herndon, Virginia, United States Hybrid / WFH Options
Maxar Technologies Holdings Inc
with Python. Demonstrated experience building & orchestrating automated, production-level data pipelines and solutions (ETL/ELT). Experience with file-based data storage, including Parquet or Iceberg. Experience with data catalogs (e.g. Hive, AWS Glue). General understanding of key AWS services (e.g. EC2, S3, EKS, IAM, Lambda).
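A minimal sketch of querying a data catalog of the kind mentioned above, using boto3 against the AWS Glue Data Catalog (database, table, and region names are placeholders, not from the posting):

    import boto3

    # Look up table metadata in the AWS Glue Data Catalog.
    glue = boto3.client("glue", region_name="us-east-1")

    # Database and table names are hypothetical.
    response = glue.get_table(DatabaseName="analytics", Name="events")
    columns = response["Table"]["StorageDescriptor"]["Columns"]
    for col in columns:
        print(col["Name"], col["Type"])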
processing applications, secure data access tools. Experience integrating data-driven applications with different data sources, for example: SQL databases, document databases (MongoDB, CosmosDB, etc.), Parquet. Experience of taking different business applications and use cases and supporting their needs (query patterns, etc.) within appropriate data solutions, whilst maintaining data integrity …
data, etc. Personal skills and experience: Solid experience with Python. Able to propose and design big data ETLs. Knowledge of Spark, Hadoop, etc., using the Parquet file format. Mastery of SQL queries and data models. Hands-on AWS experience, with a focus on data & analytics. Infrastructure automation for both cloud-based …
in data integration/ETL development, including ELT patterns and hands-on experience with Matillion. Skilled in handling structured and unstructured data (JSON, XML, Parquet, etc.). Comfortable working in Linux and cloud-native environments. Strong SQL skills and experience with relational databases. Knowledge of CI/CD processes and …
with experience in Kafka real-time messaging or Azure Stream Analytics/Event Hub. Spark processing and performance tuning. File format partitioning, e.g. Parquet, JSON, XML, CSV. Azure DevOps, GitHub Actions. Hands-on experience in at least one of Python, with knowledge of the others. Experience in Data …
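As one hedged example of the file-format partitioning this listing refers to, Spark can write Parquet partitioned by a column so downstream queries can prune partitions (paths and column names are assumptions):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("partitioning-example").getOrCreate()

    # Read raw CSV and write it back out as Parquet partitioned by date.
    # Paths and the partition column are placeholders.
    events = spark.read.option("header", "true").csv("s3://example-bucket/raw/events.csv")
    (events
        .write
        .mode("overwrite")
        .partitionBy("event_date")
        .parquet("s3://example-bucket/curated/events/"))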
and automation. Proficiency in building and maintaining batch and streaming ETL/ELT pipelines at scale, employing tools such as Airflow, Fivetran, Kafka, Iceberg, Parquet, Spark, and Glue for developing end-to-end data orchestration, leveraging AWS services to ingest, transform, and process large volumes of structured and unstructured …
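A minimal Airflow sketch of the kind of batch orchestration described here (the DAG id, schedule, and task logic are hypothetical; the syntax assumes Airflow 2.x):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract():
        # Placeholder: pull data from a source system (API, database, S3, ...).
        pass


    def transform_and_load():
        # Placeholder: clean the data and load it into the warehouse or lake.
        pass


    with DAG(
        dag_id="example_daily_etl",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # Airflow 2.4+ argument; earlier 2.x uses schedule_interval
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        load_task = PythonOperator(task_id="transform_and_load", python_callable=transform_and_load)

        extract_task >> load_task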
pipelines which fetch data from public and private data suppliers' APIs, S3 buckets, and web interfaces in various formats (e.g., JSON, CSV, Excel, PDF, Parquet), join geographical shapes with data from multiple sources, and perform various transformations. Create programmatically validated data schemas, as well as human-readable documentation, to …
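One way to programmatically validate a data schema of the kind mentioned above is to cast incoming data to a declared Arrow schema, which raises if columns or types do not match (the column names, types, and file names are assumptions):

    import pandas as pd
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Declared, versionable schema for the dataset; names and types are placeholders.
    schema = pa.schema([
        ("region_code", pa.string()),
        ("indicator", pa.string()),
        ("value", pa.float64()),
    ])

    df = pd.read_csv("supplier_extract.csv")  # hypothetical supplier file

    # Casting to the declared schema fails loudly on mismatched columns or types.
    table = pa.Table.from_pandas(df, preserve_index=False).cast(schema)
    pq.write_table(table, "validated/supplier_extract.parquet")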
/ML to extract, format, and expose in indexed search tools relevant content such as raw text, multimedia (audio, image, video, document), tabular (CSV, Parquet, Avro) or nested (JSON, JSONL, XML), and other structured/unstructured data types. Data is expected to be of varying formats, schemas, and structures.
Azure Service Bus, Function Apps, ADFs. Possesses knowledge of data-related technologies such as data warehouses, Snowflake, ETL, data pipelines, PySpark, Delta tables, and file formats (Parquet, columnar). Has a good understanding of SQL and stored procedures. Able to lead development and execution of performance and automation testing for large-scale …
Starburst and Athena; Kafka and Kinesis; DataHub; MLflow and Airflow; Docker and Terraform; Kafka, Spark, Kafka Streams, and KSQL; dbt; AWS, S3, Iceberg, Parquet, Glue, and EMR for our Data Lake; Elasticsearch and DynamoDB. More information: Enjoy fantastic perks like private healthcare & dental insurance, a generous work from …
Experience in data modelling and design patterns; in-depth knowledge of relational databases (PostgreSQL) and familiarity with data lakehouse formats (storage formats, e.g. Apache Parquet, Delta tables). Experience with Spark, Databricks, data lakes/lakehouses. Experience working with external data suppliers (defining requirements for suppliers, defining Service Level …
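As a hedged illustration of the lakehouse formats named here, a Delta table write in Spark might look like the following (it assumes the delta-spark package and Delta extensions are available on the session, e.g. on a Databricks cluster; all paths are placeholders):

    from pyspark.sql import SparkSession

    # Assumes Delta Lake is already configured for this Spark session.
    spark = SparkSession.builder.appName("delta-example").getOrCreate()

    customers = spark.read.parquet("s3://example-bucket/staging/customers/")

    # Write as a Delta table; later reads get ACID snapshots and time travel.
    customers.write.format("delta").mode("overwrite").save("s3://example-bucket/lakehouse/customers/")

    # Read it back as a Delta table.
    latest = spark.read.format("delta").load("s3://example-bucket/lakehouse/customers/")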
similar tools. Leading on solution deployment using infrastructure-as-code and CI/CD practices. Transforming diverse data formats including JSON, XML, CSV, and Parquet. Creating and maintaining clear technical documentation, metadata, and data dictionaries. Your previous experience as Principal Data Engineer will include: Strong background across AWS data …
large collaboration and development environments. • Experience with data types including unstructured, structured, or semi-structured data such as CSV, JSON, JSONL, Avro, Protocol Buffers, Parquet, etc. • Experience with designing cloud-native architectures using cloud services. (Preferred) • Experience designing and operating big data systems within policy and regulatory environments. (Preferred …
Leeds, West Yorkshire, Yorkshire and the Humber, United Kingdom
Anson McCade
cleanse data using a range of tools and techniques. Manage and process structured and semi-structured data formats such as JSON, XML, CSV, and Parquet. Operate effectively in Linux and cloud-based environments. Support CI/CD processes and adopt infrastructure-as-code principles. Contribute to a …
Not required: Experience with Vega, Observable Plot, ggplot, or another grammar-of-graphics library. Experience in Python, FastAPI. Expertise in data engineering topics, SQL, Parquet. Experience with AWS services and serverless architectures. What we offer: Work with colleagues that lift you up, challenge you, celebrate you, and help you …
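As a small sketch combining the Python, FastAPI, and Parquet skills listed here (the file name, column, and endpoint are hypothetical; reading Parquet with pandas requires pyarrow or fastparquet to be installed):

    import pandas as pd
    from fastapi import FastAPI

    app = FastAPI()

    # Load a Parquet extract at startup; file and column names are placeholders.
    metrics = pd.read_parquet("metrics.parquet")

    @app.get("/metrics/{region}")
    def metrics_for_region(region: str):
        # Return matching rows as JSON records.
        subset = metrics[metrics["region"] == region]
        return subset.to_dict(orient="records")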
and experience working within a data-driven organization. Hands-on experience with architecting, implementing, and performance tuning of: data lake technologies (e.g. Delta Lake, Parquet, Spark, Databricks); APIs & microservices; message queues, streaming technologies, and event-driven architecture; NoSQL databases and query languages; data domain and event data models; data …
Java. Experience with full lifecycle agile software development projects. Desired skills: Experience with Python. Experience building data products in Apache Avro and/or Parquet. On-the-job experience with Java software development. Experience deploying the complete DevOps lifecycle, including integration of build pipelines, automated deployments, and compliance scanning …
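A small hedged sketch of producing an Avro data product with the fastavro library (the record schema, field names, and output file are hypothetical):

    from fastavro import parse_schema, writer

    # Hypothetical record schema for a small data product.
    schema = parse_schema({
        "type": "record",
        "name": "Trade",
        "fields": [
            {"name": "trade_id", "type": "string"},
            {"name": "symbol", "type": "string"},
            {"name": "price", "type": "double"},
        ],
    })

    records = [
        {"trade_id": "t-1", "symbol": "ABC", "price": 101.25},
        {"trade_id": "t-2", "symbol": "XYZ", "price": 99.10},
    ]

    # Write the records to an Avro container file.
    with open("trades.avro", "wb") as out:
        writer(out, schema, records)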
new technologies and frameworks. Nice to have: Knowledge of databases, SQL. Familiarity with Boost ASIO. Familiarity with data serialization formats such as Apache Arrow/Parquet, Google Protocol Buffers, FlatBuffers. Contra. Experience with gRPC, HTTP/REST, and WebSocket protocols. Experience with Google Cloud/AWS and/or containerization …