data programs. 5+ years of advanced expertise in Google Cloud data services: Dataproc, Dataflow, Pub/Sub, BigQuery, Cloud Spanner, and Bigtable. Hands-on experience with orchestration tools like Apache Airflow or Cloud Composer. Hands-on experience with one or more of the following GCP data processing services: Dataflow (Apache Beam), Dataproc (Apache Spark/Hadoop) … or Composer (Apache Airflow). Proficiency in at least one scripting/programming language (e.g., Python, Java, Scala) for data manipulation and pipeline development; Scala is mandated in some cases. Deep understanding of data lakehouse design, event-driven architecture, and hybrid cloud data strategies. Strong proficiency in SQL and experience with schema design and query optimization for large …
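For context on what these requirements look like in practice, here is a minimal sketch of a streaming pipeline of the kind described: Pub/Sub into BigQuery with Apache Beam, runnable on Dataflow. The project, subscription, table, and schema names are illustrative assumptions, not taken from the listing.

```python
# Minimal Beam sketch: Pub/Sub -> parse JSON -> BigQuery (streaming).
# Run locally with DirectRunner or on GCP with --runner=DataflowRunner.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/events-sub"  # hypothetical
        )
        | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
        | "Write" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",  # hypothetical table
            schema="event_id:STRING,ts:TIMESTAMP,payload:STRING",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```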
production issues. Optimize applications for performance and responsiveness. Stay Up to Date with Technology: Keep yourself and the team updated on the latest Python technologies, frameworks, and tools like Apache Spark, Databricks, Apache Pulsar, Apache Airflow, Temporal, and Apache Flink, sharing knowledge and suggesting improvements. Documentation: Contribute to clear and concise documentation for software, processes … cloud platforms like AWS, GCP, or Azure. DevOps Tools: Familiarity with containerization (Docker) and infrastructure automation tools like Terraform or Ansible. Real-time Data Streaming: Experience with Apache Pulsar or similar systems for real-time messaging and stream processing is a plus. Data Engineering: Experience with Apache Spark, Databricks, or similar big data platforms for processing large datasets, building data pipelines, and machine learning workflows. Workflow Orchestration: Familiarity with tools like Apache Airflow or Temporal for managing workflows and scheduling jobs in distributed systems. Stream Processing: Experience with Apache Flink or other stream processing frameworks is a plus. Desired Skills: Asynchronous Programming: Familiarity with asynchronous programming tools like Celery or asyncio. Frontend …
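As a rough illustration of the real-time messaging work this role mentions, a minimal Apache Pulsar consumer sketch follows; the broker URL, topic, and subscription name are assumptions.

```python
# Minimal Pulsar consumer sketch using the pulsar-client library.
import pulsar

client = pulsar.Client("pulsar://localhost:6650")  # hypothetical broker
consumer = client.subscribe("orders", subscription_name="order-processor")

for _ in range(10):  # bounded for the example; a service would loop forever
    msg = consumer.receive()                      # blocks until a message arrives
    print("Received:", msg.data().decode("utf-8"))
    consumer.acknowledge(msg)                     # ack so the message is not redelivered

client.close()
```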
data warehousing concepts and data modeling. Excellent problem-solving and communication skills focused on delivering high-quality solutions. Understanding of or hands-on experience with orchestration tools such as Apache Airflow. Deep knowledge of non-functional requirements such as availability, scalability, operability, and maintainability.
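A minimal sketch of the Apache Airflow orchestration referenced above (Airflow 2.x API); the DAG id, schedule, and task bodies are placeholders.

```python
# Two dependent tasks in a daily DAG: extract runs first, then transform.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull data from source")


def transform():
    print("clean and reshape")


with DAG(
    dag_id="example_etl",          # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",             # Airflow 2.4+ keyword
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task  # transform runs only after extract succeeds
```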
and well-tested solutions to automate data ingestion, transformation, and orchestration across systems. Own data operations infrastructure: Manage and optimise key data infrastructure components within AWS, including Amazon Redshift, Apache Airflow for workflow orchestration, and other analytical tools. You will be responsible for ensuring the performance, reliability, and scalability of these systems to meet the growing demands of … pipelines, data warehouses, and leveraging AWS data services. Strong proficiency in DataOps methodologies and tools, including experience with CI/CD pipelines, containerized applications, and workflow orchestration using Apache Airflow. Familiar with ETL frameworks, with bonus experience in Big Data processing (Spark, Hive, Trino) and data streaming. Proven track record – you've made a demonstrable impact …
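One plausible shape for the Redshift automation this listing describes, using the Redshift Data API via boto3; the cluster, database, bucket, and IAM role are placeholders, not taken from the listing.

```python
# Submit a COPY statement to Redshift from an automated job via the Data API.
import boto3

client = boto3.client("redshift-data", region_name="eu-west-1")

response = client.execute_statement(
    ClusterIdentifier="analytics-cluster",  # hypothetical cluster
    Database="warehouse",
    DbUser="etl_user",
    Sql=(
        "COPY events FROM 's3://my-bucket/events/' "
        "IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy' "
        "FORMAT AS PARQUET;"
    ),
)
print("Statement id:", response["Id"])  # poll describe_statement(Id=...) for completion
```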
database design, and data normalisation is required for the role. Equally, proficiency in Python and SQL is essential, ideally with experience using data processing frameworks such as Kafka, NoSQL, Airflow, TensorFlow, or Spark. Finally, experience with cloud platforms like AWS or Azure, including data services such as Apache Airflow, Athena, or SageMaker, is essential for the role. …
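To illustrate the Kafka experience being asked for, a minimal consumer sketch with the confluent-kafka client; the broker address, group id, and topic are assumptions.

```python
# Minimal Kafka consumer: poll, skip errors, print key/value.
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # hypothetical broker
    "group.id": "feature-pipeline",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["user-events"])  # hypothetical topic

try:
    for _ in range(100):  # bounded loop for the example
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print("Consumer error:", msg.error())
            continue
        print(msg.key(), msg.value().decode("utf-8"))
finally:
    consumer.close()
```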
technologies – Azure, AWS, GCP, Snowflake, Databricks. Must have hands-on experience with at least 2 hyperscalers (GCP/AWS/Azure platforms), specifically in Big Data processing services (Apache Spark, Beam, or equivalent). In-depth knowledge of key technologies like BigQuery/Redshift/Synapse, Pub/Sub/Kinesis/MQ/Event Hubs, Kafka, Dataflow/Airflow/ADF, etc. Excellent consulting experience and ability to design and build solutions, actively contributing to RFP responses. Ability to be a SPOC for all technical discussions across industry groups. Excellent design experience, with the entrepreneurship to own and lead solutions for clients. Ability to define monitoring, alerting, and deployment strategies for various services. Experience providing … skills. A minimum of 5 years' experience in a similar role. Ability to lead and mentor architects. Mandatory Skills [at least 2 hyperscalers]: GCP, AWS, Azure, Big Data, Apache Spark/Beam on BigQuery/Redshift/Synapse, Pub/Sub/Kinesis/MQ/Event Hubs, Kafka, Dataflow/Airflow/ADF. Designing Databricks-based solutions for …
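As a small concrete example of the BigQuery work implied here, a parameterised query with the official Python client; the project, dataset, and table are illustrative.

```python
# Parameterised daily-aggregate query against BigQuery.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # hypothetical project

job = client.query(
    """
    SELECT user_id, COUNT(*) AS events
    FROM `my-project.analytics.events`
    WHERE DATE(ts) = @day
    GROUP BY user_id
    """,
    job_config=bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("day", "DATE", "2024-01-01")]
    ),
)
for row in job.result():  # blocks until the query finishes
    print(row.user_id, row.events)
```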
to-end, scalable data and AI solutions using the Databricks Lakehouse (Delta Lake, Unity Catalog, MLflow). Design and lead the development of modular, high-performance data pipelines using Apache Spark and PySpark. Champion the adoption of Lakehouse architecture (bronze/silver/gold layers) to ensure scalable, governed data platforms. Collaborate with stakeholders, analysts, and data scientists to … datasets. Promote CI/CD, DevOps, and data reliability engineering (DRE) best practices across Databricks environments. Integrate with cloud-native services and orchestrate workflows using tools such as dbt, Airflow, and Databricks Workflows. Drive performance tuning, cost optimisation, and monitoring across data workloads. Mentor engineering teams and support architectural decisions as a recognised Databricks expert. Essential Skills & Experience: Demonstrable expertise with Databricks and Apache Spark in production environments. Proficiency in PySpark, SQL, and working within one or more cloud platforms (Azure, AWS, or GCP). In-depth understanding of Lakehouse concepts, medallion architecture, and modern data warehousing. Experience with version control, testing frameworks, and automated deployment pipelines (e.g., GitHub Actions, Azure DevOps). Sound knowledge of data governance …
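A brief sketch of the bronze-to-silver hop in the medallion architecture this role centres on, using PySpark and Delta Lake; the paths, dedup key, and quality rule are assumptions, and a Databricks or Delta-enabled runtime is presumed.

```python
# Bronze -> silver: dedup, filter, and stamp lineage before writing back as Delta.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bronze_to_silver").getOrCreate()

bronze = spark.read.format("delta").load("/lake/bronze/orders")  # hypothetical path

silver = (
    bronze
    .dropDuplicates(["order_id"])                      # dedup on the business key
    .filter(F.col("amount") > 0)                       # basic quality rule
    .withColumn("ingested_at", F.current_timestamp())  # lineage column
)

silver.write.format("delta").mode("overwrite").save("/lake/silver/orders")
```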
Apollo Solutions – London, England, United Kingdom (Hybrid/WFH Options)
Go or R for data manipulation and analysis, with the ability to build, maintain, and deploy sequences of automated processes. Bonus Experience (Nice to Have): Familiarity with dbt, Fivetran, Apache Airflow, Data Mesh, Data Vault 2.0, Fabric, and Apache Spark. Experience working with streaming technologies such as Apache Kafka, Apache Flink, or Google Cloud Dataflow.
normalisation is required for the role. Equally, strong ML experience and proficiency in Python and SQL are essential, ideally with experience using data processing frameworks such as Kafka, NoSQL, Airflow, TensorFlow, or Spark. Finally, experience with cloud platforms like AWS or Azure, including data services such as Apache Airflow, Athena, or SageMaker, is essential. This is a …
software teams. Embrace variety and challenge. Desirable Experience: Proficiency in at least one of Python, Go, Java, Ruby. Working knowledge of Kubernetes. Exposure to ETL systems at scale (e.g. Apache Airflow, Argo Workflows). Exposure to streaming data platforms (e.g. Apache Kafka, RabbitMQ). Working knowledge of networking fundamentals. Comfort deploying software to the cloud and on-premises. Record …
relational and NoSQL databases. Experience with data modelling. General understanding of data architectures and event-driven architectures. Proficient in SQL. Familiarity with one scripting language, preferably Python. Experience with Apache Airflow and Apache Spark. Solid understanding of cloud data services: AWS services such as S3, Athena, EC2, Redshift, EMR (Elastic MapReduce), EKS, RDS (Relational Database Service), and Lambda.
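To make the AWS data-services requirement concrete, an illustrative Athena query via boto3; the database, table, and results bucket are placeholders.

```python
# Kick off an Athena query over data in S3; results land in the given bucket.
import boto3

athena = boto3.client("athena", region_name="eu-west-1")

execution = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM web_logs GROUP BY status",  # hypothetical table
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print("Query id:", execution["QueryExecutionId"])  # poll get_query_execution for state
```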
to-end, scalable data and AI solutions using the Databricks Lakehouse (Delta Lake, Unity Catalog, MLflow). Design and lead the development of modular, high-performance data pipelines using Apache Spark and PySpark. Champion the adoption of Lakehouse architecture (bronze/silver/gold layers) to ensure scalable, governed data platforms. Collaborate with stakeholders, analysts, and data scientists to … datasets. Promote CI/CD, DevOps, and data reliability engineering (DRE) best practices across Databricks environments. Integrate with cloud-native services and orchestrate workflows using tools such as dbt, Airflow, and Databricks Workflows. Drive performance tuning, cost optimisation, and monitoring across data workloads. Mentor engineering teams and support architectural decisions as a recognised Databricks expert. Demonstrable expertise with Databricks and Apache Spark in production environments. Proficiency in PySpark, SQL, and working within one or more cloud platforms (Azure, AWS, or GCP). In-depth understanding of Lakehouse concepts, medallion architecture, and modern data warehousing. Experience with version control, testing frameworks, and automated deployment pipelines (e.g., GitHub Actions, Azure DevOps). Sound knowledge of data governance, security, and compliance.
In Technology Group – Newcastle Upon Tyne, England, United Kingdom (Hybrid/WFH Options)
warehousing. Proficiency in Python or another programming language used for data engineering. Experience with cloud platforms (e.g., Azure, AWS, or GCP) is highly desirable. Familiarity with tools such as Apache Airflow, Spark, or similar is a plus. What's On Offer: Competitive salary between £45,000 – £55,000, depending on experience. Flexible hybrid working – 3 days on-site …
and familiarity with templating approaches (e.g., Jinja). Hands-on experience with cloud technologies, ideally within AWS environments. Proven ability to work with orchestration platforms; experience with tools like Apache Airflow is a plus. Comfortable developing CI/CD workflows, ideally using tools such as GitHub Actions. Experience building and maintaining modern data pipelines and infrastructure. Cooperative approach …
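A small sketch of the Jinja templating this listing refers to, applied to SQL generation; all table and column names are made up for illustration.

```python
# Render a parameterised SQL statement from a Jinja template.
from jinja2 import Template

template = Template(
    """
    SELECT {{ columns | join(', ') }}
    FROM {{ schema }}.{{ table }}
    WHERE load_date = '{{ load_date }}'
    """
)
sql = template.render(
    columns=["order_id", "amount"],  # hypothetical columns
    schema="staging",
    table="orders",
    load_date="2024-01-01",
)
print(sql)
```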
Endava Limited – London, England, United Kingdom (Hybrid/WFH Options)
delivering high-quality solutions aligned with business objectives. Key Responsibilities: Architect, implement, and maintain real-time and batch data pipelines to handle large datasets efficiently. Employ frameworks such as Apache Spark, Databricks, Snowflake, or Airflow to automate ingestion, transformation, and delivery. Data Integration & Transformation: Work with Data Analysts to understand source-to-target mappings and quality requirements. Build … security measures (RBAC, encryption) and ensure regulatory compliance (GDPR). Document data lineage and recommend improvements for data ownership and stewardship. Qualifications: Programming: Python, SQL, Scala, Java. Big Data: Apache Spark, Hadoop, Databricks, Snowflake, etc. Data Modelling: Designing dimensional, relational, and hierarchical data models. Scalability & Performance: Building fault-tolerant, highly available data architectures. Security & Compliance: Enforcing role-based access …
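As an example of the real-time leg of such pipelines, a Spark Structured Streaming sketch reading from Kafka into Parquet (requires the spark-sql-kafka connector on the classpath); the broker, topic, and paths are assumptions.

```python
# Streaming ingest: Kafka -> cast payload to string -> append to Parquet files.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream_ingest").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # hypothetical broker
    .option("subscribe", "events")
    .load()
    .select(F.col("value").cast("string").alias("payload"))
)

query = (
    events.writeStream.format("parquet")
    .option("path", "/lake/raw/events")
    .option("checkpointLocation", "/lake/_checkpoints/events")  # enables recovery on restart
    .start()
)
query.awaitTermination()
```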
Jefferson Frank – City of London, England, United Kingdom (Hybrid/WFH Options)
business's data arm. Requirements: 3+ years' data engineering experience; Snowflake experience; proficiency across an AWS tech stack; dbt expertise; Terraform experience. Nice to Have: Data Modelling, Data Vault, Apache Airflow. Benefits: Up to 10% bonus, up to 14% pension contribution, 29 days annual leave + bank holidays, free company shares. Interviews are ongoing – don't miss your chance.
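For reference, a minimal Snowflake connection sketch with the official Python connector; the account, credentials, warehouse, and database are placeholders.

```python
# Connect to Snowflake and run a trivial query to verify the session.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # hypothetical account identifier
    user="etl_user",
    password="***",
    warehouse="TRANSFORM_WH",
    database="ANALYTICS",
    schema="PUBLIC",
)
try:
    cur = conn.cursor()
    cur.execute("SELECT CURRENT_VERSION()")
    print(cur.fetchone())
finally:
    conn.close()
```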
Connexity – London, England, United Kingdom (Hybrid/WFH Options)
Degree (Bachelor's/Master's) in computer science or a related field. Solid programming skills in both Python and SQL. Proven work experience in Google Cloud Platform or other clouds, developing batch (Apache Airflow) and streaming (Dataflow) scalable data pipelines. Experience processing large datasets at scale (BigQuery, Apache Druid, Elasticsearch). Familiarity with Terraform, dbt & Looker is a plus. Passion around …
Synapse Analytics with Spark and SQL, Azure Functions with Python, Azure Purview, and Cosmos DB. They are also proficient in Azure Event Hubs and Stream Analytics, Managed Streaming for Apache Kafka, Azure Databricks with Spark, and other open-source technologies like Apache Airflow and dbt, Spark/Python, or Spark/Scala. Preferred Education: Bachelor's Degree.
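To illustrate the Azure Event Hubs experience listed, a minimal consumer sketch with the azure-eventhub SDK; the connection string and hub name are placeholders (without a checkpoint store, checkpoints are in-memory only).

```python
# Receive events from an Event Hub and print them per partition.
from azure.eventhub import EventHubConsumerClient


def on_event(partition_context, event):
    print(partition_context.partition_id, event.body_as_str())
    partition_context.update_checkpoint(event)  # record progress


client = EventHubConsumerClient.from_connection_string(
    "Endpoint=sb://...;SharedAccessKeyName=...;SharedAccessKey=...",  # placeholder
    consumer_group="$Default",
    eventhub_name="telemetry",  # hypothetical hub
)
with client:
    client.receive(on_event=on_event, starting_position="-1")  # "-1" = from the beginning
```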
processes, including prompt logic, model usage, and data governance considerations, promoting transparency and responsible AI use. 7. Automate ETL pipeline orchestration and data processing workflows: Leverage orchestration tools like Apache Airflow or Prefect to schedule, automate, and manage ETL jobs, reducing manual intervention and improving operational reliability. 8. Implement monitoring, alerting, and troubleshooting for data workflows: Set up real … Bash. Knowledge of Linux-based operating systems. Experience with the Amazon Web Services cloud platform or another cloud platform. Proven work experience with containerization and orchestration tools such as Docker, Airflow, Prefect, and Kubernetes. Knowledge of cybersecurity best practices. Excellent communication and collaboration skills. Ability to work effectively in a fast-paced and dynamic environment. Self-disciplined and delivery-focused.
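A minimal Prefect flow sketch (Prefect 2.x API) of the kind of orchestration described in items 7 and 8; the task bodies and names are placeholders.

```python
# A two-task ETL flow with automatic retries on the extract step.
from prefect import flow, task


@task(retries=2)  # retried automatically on failure
def extract() -> list[int]:
    return [1, 2, 3]  # placeholder data


@task
def load(rows: list[int]) -> None:
    print(f"loaded {len(rows)} rows")


@flow
def etl() -> None:
    load(extract())


if __name__ == "__main__":
    etl()
```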
ACLED – City of London, England, United Kingdom (Hybrid/WFH Options)
English, problem-solving skills, attention to detail, ability to work remotely. Desirable: Cloud architecture certification (e.g., AWS Certified Solutions Architect). Experience with Drupal CMS, geospatial/mapping tools, Apache Airflow, serverless architectures, API gateways. Interest in conflict data, humanitarian tech, open data platforms; desire to grow into a solution architect or technical lead role. Application Process: Submit …
future-proofing of the data pipelines. ETL and Automation Excellence: Lead the development of specialized ETL workflows, ensuring they are fully automated and optimized for performance using tools like Apache Airflow, Snowflake, and other cloud-based technologies. Drive improvements across all stages of the ETL cycle, including data extraction, transformation, and loading. Infrastructure & Pipeline Enhancement: Spearhead the upgrading …