extract data from diverse sources, transform it into usable formats, and load it into data warehouses, data lakes, or lakehouses. Big Data Technologies: Utilize big data technologies such as Spark, Kafka, and Flink for distributed data processing and analytics. Cloud Platforms: Deploy and manage data solutions on cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP), leveraging … SQL for data manipulation and scripting. Strong understanding of data modelling concepts and techniques, including relational and dimensional modelling. Experience with big data technologies and frameworks such as Databricks, Spark, Kafka, and Flink. Experience using modern data architectures such as the lakehouse. Experience with CI/CD pipelines and version control systems like Git. Knowledge of ETL tools and … technologies such as Apache Airflow, Informatica, or Talend. Knowledge of data governance and best practices in data management. Familiarity with cloud platforms and services such as AWS, Azure, or GCP for deploying and managing data solutions. Strong problem-solving and analytical skills with the ability to diagnose and resolve complex data-related issues. SQL (for database management and querying …
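To make the extract-transform-load workflow described above concrete, here is a minimal sketch of such a job in PySpark; the file paths and column names are hypothetical, chosen only for illustration.

```python
# Minimal sketch of an extract-transform-load job of the kind described
# above, using PySpark. Paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV from a landing zone (hypothetical path).
raw = spark.read.option("header", True).csv("/landing/orders.csv")

# Transform: cast types, drop malformed rows, derive a reporting column.
clean = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .filter(F.col("amount").isNotNull())
       .withColumn("order_date", F.to_date("order_ts"))
)

# Load: write partitioned Parquet into the warehouse/lakehouse layer.
clean.write.mode("overwrite").partitionBy("order_date").parquet("/warehouse/orders")
```

Partitioning the output by a date column is a common convention that lets downstream warehouse and lakehouse queries prune files efficiently.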
In this role, you will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a passion for data engineering and … significant impact, we encourage you to apply! Job Responsibilities ETL/ELT Pipeline Development: Design, develop, and optimize efficient and scalable ETL/ELT pipelines using Python, PySpark, and Apache Airflow. Implement batch and real-time data processing solutions using Apache Spark. Ensure data quality, governance, and security throughout the data lifecycle. Cloud Data Engineering: Manage and optimize … effectiveness. Implement and maintain CI/CD pipelines for data workflows to ensure smooth and reliable deployments. Big Data & Analytics: Develop and optimize large-scale data processing pipelines using Apache Spark and PySpark. Implement data partitioning, caching, and performance tuning techniques to enhance Spark-based workloads. Work with diverse data formats (structured and unstructured) to support advanced …
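As an illustration of the orchestration work this role describes, a minimal Airflow DAG might wire up extract, transform, and load steps like this. The DAG id, schedule, and task bodies are hypothetical placeholders (`schedule_interval` is the pre-Airflow-2.4 spelling; newer releases also accept `schedule`).

```python
# Hypothetical sketch of an Airflow DAG orchestrating a three-stage pipeline.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # pull source data, e.g. from an API or object store

def transform():
    ...  # e.g. submit a PySpark job that cleans and enriches the data

def load():
    ...  # write the result to the warehouse / lakehouse layer

with DAG(
    dag_id="daily_sales_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2 >> t3  # linear dependency: extract, then transform, then load
```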
In this role, you will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a passion for data engineering and … to apply! Job Responsibilities Data Engineering & Data Pipeline Development: Design, develop, and optimize scalable data workflows using Python, PySpark, and Airflow. Implement real-time and batch data processing using Spark. Enforce best practices for data quality, governance, and security throughout the data lifecycle. Ensure data availability, reliability, and performance through monitoring and automation. Cloud Data Engineering: Manage cloud infrastructure … data processing workloads. Implement CI/CD pipelines for data workflows to ensure smooth and reliable deployments. Big Data & Analytics: Build and optimize large-scale data processing pipelines using Apache Spark and PySpark. Implement data partitioning, caching, and performance tuning for Spark-based workloads. Work with diverse data formats (structured and unstructured) to support advanced analytics and …
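The partitioning, caching, and performance-tuning techniques mentioned above could look roughly like this in PySpark; the table paths, join keys, and partition count are hypothetical.

```python
# Sketch of common Spark tuning techniques: repartitioning, caching,
# and broadcast joins. Table names and keys are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.appName("tuning_demo").getOrCreate()

events = spark.read.parquet("/lake/events")    # large fact table
dims = spark.read.parquet("/lake/dimensions")  # small dimension table

# Repartition by the join key so related rows are co-located across executors.
events = events.repartition(200, "customer_id")

# Cache a DataFrame that several downstream aggregations will reuse.
events.cache()

# Broadcast the small table to avoid a shuffle-heavy join.
joined = events.join(broadcast(dims), "customer_id")

joined.groupBy("segment").count().show()
```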
FTPS), and remediation of security vulnerabilities (DAST, Azure Defender). Expertise in Python for writing efficient code and maintaining reusable libraries. Experienced with microservice design patterns, and Databricks/Spark for big data processing. Strong knowledge of SQL/NoSQL databases and corresponding ELT workflows. Excellent problem-solving, communication, and collaboration skills in fast-paced environments. 3 years' professional experience …
two of the following: Python, SQL, Java. Commercial experience in client-facing projects is a plus, especially within multi-disciplinary teams. Deep knowledge of database technologies: distributed systems (e.g., Spark, Hadoop, EMR), RDBMS (e.g., SQL Server, Oracle, PostgreSQL, MySQL), NoSQL (e.g., MongoDB, Cassandra, DynamoDB, Neo4j). Solid understanding of software engineering best practices - code reviews, testing frameworks, CI/CD …
Functions, Azure SQL Database, HDInsight, and Azure Machine Learning Studio. Data Storage & Databases: SQL & NoSQL Databases: Experience with databases like PostgreSQL, MySQL, MongoDB, and Cassandra. Big Data Ecosystems: Hadoop, Spark, Hive, and HBase. Data Integration & ETL: Data Pipelining Tools: Apache NiFi, Apache Kafka, and Apache Flink. ETL Tools: AWS Glue, Azure Data Factory, Talend, and Apache …
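As a sketch of the Kafka-based data movement listed here, a round trip with the kafka-python client might look like the following; the broker address and topic name are hypothetical.

```python
# Minimal sketch of Kafka-based data movement using the kafka-python
# client. Broker address and topic name are hypothetical.
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("page_views", {"user_id": 42, "url": "/home"})
producer.flush()  # block until the message is actually delivered

consumer = KafkaConsumer(
    "page_views",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)
for message in consumer:
    print(message.value)  # downstream processing would happen here
    break
```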
real-time data pipelines for processing large-scale data. Experience with ETL processes for data ingestion and processing. Proficiency in Python and SQL. Experience with big data technologies like Apache Hadoop and Apache Spark. Familiarity with real-time data processing frameworks such as Apache Kafka or Flink. MLOps & Deployment: Experience deploying and maintaining large-scale ML inference …
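A hedged sketch of the real-time inference loop this role describes: consume feature records from Kafka and score them with a pre-trained model. The model file, topic, and feature names are hypothetical, and the model is assumed to be a scikit-learn estimator serialised with joblib.

```python
# Hypothetical streaming-inference loop: Kafka in, model scores out.
import json
import joblib
from kafka import KafkaConsumer

model = joblib.load("model.pkl")  # assumed pre-trained scikit-learn estimator

consumer = KafkaConsumer(
    "feature_events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for msg in consumer:
    features = [[msg.value["f1"], msg.value["f2"]]]  # 2-D input for sklearn
    score = model.predict(features)[0]
    print(f"prediction: {score}")  # in production this would be published onward
```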
Skills: Proven expertise in designing, building, and operating data pipelines, warehouses, and scalable data architectures. Deep hands-on experience with modern data stacks. Our tech includes Python, SQL, Snowflake, Apache Iceberg, AWS S3, PostgresDB, Airflow, dbt, and Apache Spark, deployed via AWS, Docker, and Terraform. Experience with similar technologies is essential. Coaching & Growth Mindset: Passion for developing …
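As one small illustration of this stack, landing a table in S3 as Parquet from Python might look like this; the bucket and paths are hypothetical, and writing to s3:// URLs assumes the s3fs and pyarrow packages are installed.

```python
# Hypothetical sketch: land a small table in S3 as Parquet, a common
# step in an S3 + Airflow + Spark stack. Bucket name is made up.
import pandas as pd

df = pd.DataFrame({"customer_id": [1, 2], "spend": [120.5, 87.0]})

# pandas delegates s3:// URLs to s3fs; credentials come from the
# standard AWS environment/config chain.
df.to_parquet("s3://example-data-lake/raw/customers.parquet", index=False)
```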
on platforms such as AWS, Azure, GCP, and Snowflake. Understanding of cloud platform infrastructure and its impact on data architecture. A solid understanding of big data technologies such as Apache Spark, and knowledge of Hadoop ecosystems. Knowledge of programming languages such as Python, R, or Java is beneficial. Exposure to ETL/ELT processes, SQL, and NoSQL databases is …
listed below. AI techniques (supervised and unsupervised machine learning, deep learning, graph data analytics, statistical analysis, time series, geospatial analysis, NLP, sentiment analysis, pattern detection, etc.). Python, R, or Spark for data insights. Databricks/DataIQ. SQL for data access and processing (PostgreSQL preferred, but general SQL knowledge is important). Latest Data Science platforms (e.g., Databricks, Dataiku, AzureML, SageMaker) and frameworks (e.g., TensorFlow, MXNet, scikit-learn). Software engineering practices (coding standards, unit testing, version control, code review). Hadoop distributions (Cloudera, Hortonworks), NoSQL databases (Neo4j, Elastic), streaming technologies (Spark Streaming). Data manipulation and wrangling techniques. Development and deployment technologies (virtualisation, CI tools like Jenkins, configuration management with Ansible, containerisation with Docker, Kubernetes). Data visualization skills (JavaScript preferred). Experience …
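A minimal sketch of the supervised-learning workflow referenced above, using scikit-learn on a synthetic dataset; the model choice and hyperparameters are illustrative only.

```python
# Minimal supervised-learning workflow with scikit-learn on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate a synthetic binary-classification dataset.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train and evaluate a baseline model.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```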
priorities aimed at maximizing value through data utilization. Knowledge/Experience: Expertise in Commercial/Procurement Analytics. Experience in SAP (S/4 Hana). Experience with Spark, Databricks, or similar data processing tools. Strong technical proficiency in data modelling, SQL, NoSQL databases, and data warehousing. Hands-on experience with data pipeline development, ETL processes, and big data technologies (e.g., Hadoop, Spark, Kafka). Proficiency in cloud platforms such as AWS, Azure, or Google Cloud and cloud-based data services (e.g., AWS Redshift, Azure Synapse Analytics, Google BigQuery). Experience with DataOps practices and tools, including CI/CD for data pipelines. …
South West London, London, United Kingdom Hybrid / WFH Options
TALENT INTERNATIONAL UK LTD
capacity. Strong proficiency in Python for data processing and automation. Deep knowledge of ETL/ELT frameworks and best practices. Hands-on experience with Big Data tools (e.g. Hadoop, Spark, Kafka, Hive). Familiarity with cloud data platforms (e.g. AWS, Azure, GCP). Strong understanding of data architecture, pipelines, warehousing, and performance tuning. Excellent communication and stakeholder engagement skills. Desirables: Experience …
of Relational Databases and Data Warehousing concepts. Experience of Enterprise ETL tools such as Informatica, Talend, DataStage or Alteryx. Project experience using any of the following technologies: Hadoop, Spark, Scala, Oracle, Pega, Salesforce. Cross and multi-platform experience. Team building and leading. You must be: Willing to work on client sites, potentially for extended periods. Willing to travel …
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
ADLIB Recruitment
systems Clear communicator, able to translate complex data concepts to cross-functional teams Bonus points for experience with: DevOps tools like Docker, Kubernetes, CI/CD Big data tools (Spark, Hadoop), ETL workflows, or high-throughput data streams Genomic data formats and tools Cold and hot storage management, ZFS/RAID systems, or tape storage AI/LLM tools …
Maths or similar Science or Engineering discipline Strong Python and other programming skills (Java and/or Scala desirable) Strong SQL background Some exposure to big data technologies (Hadoop, Spark, Presto, etc.) NICE TO HAVES OR EXCITED TO LEARN: Some experience designing, building and maintaining SQL databases (and/or NoSQL). Some experience with designing efficient physical data models …
in data engineering, architecture, or platform management roles, with 5+ years in leadership positions. Expertise in modern data platforms (e.g., Azure, AWS, Google Cloud) and big data technologies (e.g., Spark, Kafka, Hadoop). Strong knowledge of data governance frameworks, regulatory compliance (e.g., GDPR, CCPA), and data security best practices. Proven experience in enterprise-level architecture design and implementation. Hands …
cron jobs, job orchestration, and error monitoring tools. Good to have: Experience with Azure Bicep or other Infrastructure-as-Code tools. Exposure to real-time/streaming data (Kafka, Spark Streaming, etc.). Understanding of data mesh, data contracts, or domain-driven data architecture. Hands-on experience with MLflow and Llama …
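Since the listing calls out MLflow, a minimal experiment-tracking sketch might look like this; the experiment name, parameters, and metric values are hypothetical.

```python
# Hypothetical MLflow experiment-tracking sketch.
import mlflow

mlflow.set_experiment("demand_forecasting")

with mlflow.start_run():
    # Record hyperparameters for this run.
    mlflow.log_param("model_type", "gradient_boosting")
    mlflow.log_param("learning_rate", 0.1)
    # ...train the model here...
    # Record an evaluation metric so runs can be compared in the UI.
    mlflow.log_metric("rmse", 4.2)
```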
Python Extensive experience with cloud platforms (AWS, GCP, or Azure) Experience with: Data warehousing and lake architectures ETL/ELT pipeline development SQL and NoSQL databases Distributed computing frameworks (Spark, Kinesis, etc.) Software development best practices including CI/CD, TDD and version control. Containerisation tools like Docker or Kubernetes Experience with Infrastructure as Code tools (e.g. Terraform or …
SageMaker, GCP AI Platform, Azure ML, or equivalent). Solid understanding of data-engineering concepts: SQL/NoSQL, data pipelines (Airflow, Prefect, or similar), and batch/streaming frameworks (Spark, Kafka). Leadership & Communication: Proven ability to lead cross-functional teams in ambiguous startup settings. Exceptional written and verbal communication skills: able to explain complex concepts to both technical …
in Microsoft Fabric and Databricks, including data pipeline development, data warehousing, and data lake management Proficiency in Python, SQL, Scala, or Java Experience with data processing frameworks such as Apache Spark, Apache Beam, or Azure Data Factory Strong understanding of data architecture principles, data modelling, and data governance Experience with cloud-based data platforms, including Azure and …
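As an illustration of one framework named here, a minimal batch pipeline in the Apache Beam Python SDK might look like the following; the input and output paths are hypothetical, and the same pipeline can target other runners (e.g. Dataflow) via pipeline options.

```python
# Minimal Apache Beam batch pipeline: read, parse, filter, write.
import apache_beam as beam

with beam.Pipeline() as p:
    (
        p
        | "Read" >> beam.io.ReadFromText("input.txt")       # one line per element
        | "ParseInts" >> beam.Map(int)                      # parse each line
        | "FilterPositive" >> beam.Filter(lambda n: n > 0)  # keep positive values
        | "Format" >> beam.Map(str)
        | "Write" >> beam.io.WriteToText("output")          # sharded text output
    )
```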
tools to automate profit-and-loss forecasting and planning for the Physical Consumer business. We are building the next generation of Business Intelligence solutions using big data technologies such as Apache Spark, Hive/Hadoop, and distributed query engines. As a Data Engineer at Amazon, you will be working in a large, extremely complex and dynamic data environment. You … with ambiguity, and working in a fast-paced and ever-changing environment. Ideally, you are also experienced with at least one programming language such as Java, C++, Spark/Scala, or Python. Major Responsibilities: - Work with a team of product and program managers, engineering leaders, and business leaders to build data architectures and platforms to support business …
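A sketch of the kind of Spark SQL rollup such a Business Intelligence pipeline might run over Hive-backed tables; the database, table, and column names are hypothetical.

```python
# Hypothetical BI-style aggregation over a Hive-backed table in Spark SQL.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("pnl_rollup")
    .enableHiveSupport()  # makes Hive metastore tables queryable directly
    .getOrCreate()
)

pnl = spark.sql("""
    SELECT region, sum(revenue) - sum(cost) AS profit
    FROM sales.daily_pnl
    GROUP BY region
""")
pnl.show()
```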
researching new technologies and software versions Working with cloud technologies and different operating systems Working closely alongside Data Engineers and DevOps engineers Working with big data technologies such as Spark Demonstrating stakeholder engagement by communicating with the wider team to understand the functional and non-functional requirements of the data and the product in development and its relationship to … networks into production Experience with Docker Experience with NLP and/or computer vision Exposure to cloud technologies (e.g. AWS and Azure) Exposure to Big Data technologies Exposure to Apache products, e.g. Hive, Spark, Hadoop, NiFi Programming experience in other languages This is not an exhaustive list, and we are keen to hear from you even if you …