Pipeline, Azure Synapse, Spark Notebooks, Azure Synapse Dedicated SQL Pool Warehouse, Azure Databricks, Azure Functions and Azure Data Lake Storage. Proficiency dealing with various data formats such as relational, JSON, Parquet, streaming and others. Hands-on with strong SQL, T-SQL and Python/PySpark. Required Skills: Strong in Azure Data Factory; SOX Controls; Automation experience; Strong leadership and communication …
Cardiff, Wales, United Kingdom Hybrid / WFH Options
Identify Solutions
Cloud and big data technologies (e.g. Spark/Databricks/Delta Lake/BigQuery). Familiarity with eventing technologies (e.g. Event Hubs/Kafka) and file formats such as Parquet/Delta/Iceberg. Want to learn more? Get in touch for an informal chat.
scalable data pipelines using PySpark 3/4 and Python 3. * Contribute to the creation of a unified data lake following medallion architecture principles. * Leverage Databricks and Delta Lake (Parquet format) for efficient, reliable data processing. * Apply BDD testing practices using Python Behave and ensure code quality with Python Coverage. * Collaborate with cross-functional teams and participate in Agile …
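As a rough illustration of the kind of work described in that listing, a bronze-to-silver step in a medallion-style lakehouse might look like the following PySpark sketch. It assumes a Databricks/Delta Lake runtime; the paths, table name and columns are hypothetical and are not taken from the employer's actual pipeline.

```python
# Hypothetical bronze -> silver step in a medallion-style lakehouse.
# Paths, table names and columns are illustrative only; assumes a
# Databricks/Delta Lake environment where the "delta" format is available.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("bronze-to-silver-example")
    .getOrCreate()
)

# Read raw landed records from the bronze layer (Delta format, Parquet under the hood).
bronze = spark.read.format("delta").load("/mnt/lake/bronze/orders")

# Basic cleansing/conforming typical of a silver layer: typing, de-duplication, filtering.
silver = (
    bronze
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .dropDuplicates(["order_id"])
    .filter(F.col("order_id").isNotNull())
)

# Write the conformed table to the silver layer.
(
    silver.write
    .format("delta")
    .mode("overwrite")
    .save("/mnt/lake/silver/orders")
)
```

The BDD (Python Behave) and coverage tooling mentioned in the listing would sit around code like this as acceptance tests and quality gates rather than inside the pipeline itself.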
ML workflows is a plus. Hands-on experience with multi-terabyte scale data processing. Familiarity with AWS; Kubernetes experience is a bonus. Knowledge of data lake technologies such as Parquet, Iceberg, AWS Glue etc. Strong Python software engineering skills. Pragmatic mindset - able to evaluate trade-offs and find solutions that empower ML researchers to move quickly. Background in bioinformatics or chemistry …
globally distributed environment. Ideal, But Not Required: Experience with Vega, Observable Plot, ggplot or another grammar-of-graphics library. Experience in Python, FastAPI. Expertise in data engineering topics, SQL, Parquet. Experience with AWS services and serverless architectures. What we offer: Work with colleagues that lift you up, challenge you, celebrate you and help you grow. We come from many …
design workshops including estimating, scoping and delivering customer proposals aligned with Analytics Solutions - Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, Zookeeper, HCatalog, Solr, Avro, Parquet, Iceberg, Hudi) - Experience developing software and data engineering code in one or more programming languages (Java, Python, PySpark, Node, etc) - AWS and other Data and AI aligned Certifications PREFERRED …
Annapolis Junction, Maryland, United States Hybrid / WFH Options
Halogen Engineering Group, Inc
software, libraries, and packages involving stream/batch data processing and analytic frameworks Experience with data parsing/transformation technologies and file formats including JSON, XML, CSV, TCLD, and Parquet General Cloud and HPC knowledge regarding computer, networking, memory, and storage components Experience with Linux administration including software integration, configuration management and routine O&M operations related to provisioning …
Pipeline, Azure Synapse, Spark Notebooks, Azure Synapse Dedicated SQL Pool Warehouse, Azure Databricks, Azure Functions and Azure Data Lake Storage. Proficiency dealing with various data formats such as relational, JSON, Parquet, streaming and others. Hands-on with strong SQL, T-SQL and Python/PySpark. Required Skills: Must be onsite in Atlanta; Azure Data Factory; Automation; Shell Scripting; Azure Basic … Additional Skills: Creating data ingestion and transformation pipelines using Synapse Pipeline/Azure Data Factory … Background Check: Yes. Drug Screen: Yes. Notes: Selling points for candidate: Project Verification Info: "The …
Java Experience with Big Data streaming platforms including Spark Experience with deploying and managing Jupyter Notebook environments Experience with data parsing/transformation technologies including JSON, XML, CSV, and Parquet formats Experience with stream/batch Big Data processing and analytic frameworks Experience with CI/CD principles, methodologies, and tools such as GitLab CI Experience with IaC (Infrastructure as Code) …
options such as ECS, EKS, and Lambda IAM - Experience handling IAM resource permissions Networking - fundamental understanding of VPC, subnet routing and gateways Storage - strong understanding of S3, EBS and Parquet Databases - RDS, DynamoDB Experience doing cost estimation in Cost Explorer and planning efficiency changes Terraform and containerisation experience Understanding of a broad range of protocols like HTTP, TCP, gRPC …
Washington, Washington DC, United States Hybrid / WFH Options
Initiate Government Solutions
onsite client meetings as requested. Responsibilities and Duties (included but not limited to): ETL (Extract, Transform, and Load) to put data into a variety of target formats (text, SQL, Parquet, CSV, MDF, IRIS). Model data tables and make them practical and usable within the evolving data syndication database architecture. Design the logical and physical schemas needed to support an …
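To make that multi-format ETL responsibility concrete, a minimal sketch could write one extracted dataset to several of the listed targets. It is illustrative only: the column names, file names and the SQLite stand-in for the real SQL target are assumptions, not the project's actual code.

```python
# Illustrative only: one extracted dataset written to several of the target
# formats mentioned above (CSV, Parquet, SQL). Names and targets are hypothetical.
import pandas as pd
from sqlalchemy import create_engine

# Extract: pretend this frame came from an upstream source system.
df = pd.DataFrame(
    {
        "record_id": [1, 2, 3],
        "created_at": pd.to_datetime(["2024-01-05", "2024-02-11", "2024-03-20"]),
    }
)

# Load into multiple target formats.
df.to_csv("records.csv", index=False)          # plain text / CSV
df.to_parquet("records.parquet", index=False)  # columnar Parquet (needs pyarrow or fastparquet)

engine = create_engine("sqlite:///records.db") # stand-in for the real SQL target
df.to_sql("records", engine, if_exists="replace", index=False)
```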
with REST APIs Experience with Java Experience with full lifecycle agile software development projects Desired skills: Experience with Python. Experience building data products in Apache Avro and/or Parquet. On-the-job experience with Java software development. Experience deploying the complete DevOps lifecycle including integration of build pipelines, automated deployments, and compliance scanning using test-driven development. Responsibilities …
Leatherhead, England, United Kingdom Hybrid / WFH Options
JCW
with Azure Integration Services (e.g., Logic Apps, ADF, Service Bus, Functions) Comfortable working with Git, Azure DevOps, and unit testing practices Knowledge of common data formats: CSV, JSON, XML, Parquet Ability to lead integration designs with minimal rework required 🧾 Preferred Qualifications 🎓 Certification in SSIS or relevant Microsoft technologies 💡 Proven track record of delivering robust integration solutions 🧠 Key Skills & Traits …
as SQL Server, PostgreSQL, Teradata and others. • Proficiency in technologies in the Apache Hadoop ecosystem, especially Hive, Impala and Ranger • Experience working with open file and table formats such as Parquet, Avro, ORC, Iceberg and Delta Lake • Extensive knowledge of automation and software development tools and methodologies. • Excellent working knowledge of Linux. Good working knowledge of networking. • Ability to gain customer …
Job Description: AWS stack, with data being landed in S3, Lambda triggers, data quality checks, data written back out to AWS S3 (Parquet format), and Snowflake for the dimensional model. Design and build the data pipelines and work on understanding data transformation (supported by BAs), building out the data pipelines and moving data through the layers of the data architecture (medallion architecture …
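A minimal sketch of the S3-triggered step in that flow might look like the following. It is illustrative only: the bucket layout, column names and the CSV-to-Parquet conversion are assumptions, pandas/pyarrow would have to be packaged as a Lambda layer, and the downstream Snowflake load is not shown.

```python
# Hypothetical AWS Lambda handler: a file lands in S3, the Lambda is triggered,
# a basic quality check runs, and a cleaned copy is written back to S3 as Parquet
# for downstream loading into Snowflake. All names and prefixes are illustrative.
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

def handler(event, context):
    # S3 put event -> locate the landed object.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Read the landed CSV.
    obj = s3.get_object(Bucket=bucket, Key=key)
    df = pd.read_csv(obj["Body"])

    # Minimal data-quality check before writing onward.
    df = df.dropna(subset=["id"])

    # Write back out as Parquet under a curated prefix.
    buf = io.BytesIO()
    df.to_parquet(buf, index=False)
    s3.put_object(
        Bucket=bucket,
        Key=key.replace("landing/", "curated/").rsplit(".", 1)[0] + ".parquet",
        Body=buf.getvalue(),
    )
```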
technical standards, and drive team alignment Work closely with stakeholders to translate business needs into scalable solutions Tech environment includes Python, SQL, dbt, Databricks, BigQuery, Delta Lake, Spark, Kafka, Parquet, Iceberg (If you haven’t worked with every tool, that’s totally fine — my client values depth of thinking and engineering craft over buzzword familiarity.) What they’re looking …
and/or Rust Experience with distributed data processing frameworks such as PySpark Experience with agentic learning models Experience using MLOps frameworks and components (e.g. DVC, Horovod, Spark, ONNX, Parquet) Familiarity with SQL and modern database technologies (e.g., MinIO, Yugabyte) Understanding of secure software development practices and/or experience working in classified environments Ability to build and manage …
data Proficiency with Linux development, Git, containers, and CI/CD workflows Familiarity with SQL and at least one columnar or time-series data store (e.g., kdb+, ClickHouse, InfluxDB, Parquet/Arrow) Excellent problem-solving abilities, attention to detail, and clear communication skills Nice To Have: Prior exposure to execution algos, TCA, order-routing, or market-impact modelling Knowledge …
services 5+ Years of overall software engineering experience Experience with tech stack including: Language: Python, Golang Platform: AWS Framework: Django, Spark Storage/Data Pipelines: Postgres, Redis, ElasticSearch, Kafka, Parquet Nice To Have: Prior exposure to production machine learning systems.
key performance indicators (KPIs) to evaluate project success. Develop workflows or perform workflow analysis to optimize business processes. Utilize knowledge of relational database concepts, unstructured data concepts (e.g., JSON, Parquet), SQL, and client-server concepts to support data analysis and technical solutions. Work with project teams to facilitate the accomplishment of project activities, providing technical support and data analysis. … to assist in implementing technology solutions. Proficiency in documenting user needs, program functions, project specifications, and business process flows. Familiarity with relational database concepts, unstructured data concepts (e.g., JSON, Parquet), SQL, and client-server concepts. Excellent communication and interpersonal skills. Ability to work independently and as part of a team. Desired qualifications/non-essential skills: Experience in …