ongoing operations of scalable, performant data warehouse (Redshift) tables, data pipelines, reports and dashboards. Development of moderately to highly complex data processing jobs using appropriate technologies (e.g. SQL, Python, Spark, AWS Lambda, etc.). Development of dashboards and reports. Collaborating with stakeholders to understand business domains, requirements, and expectations. Additionally, working with owners of data source systems to understand capabilities …
London, England, United Kingdom Hybrid / WFH Options
IDEXX Laboratories, Inc
Docker/Kubernetes). Experience working in environments with AI/ML components or interest in learning data workflows for ML applications. Bonus if you have exposure to Kafka, Spark, or Flink. Experience with data compliance regulations (GDPR). What you can expect from us: opportunity for annual bonuses, medical insurance, cycle to work scheme, work from home …
of the challenges of dealing with large data sets, both structured and unstructured. Used a range of open source frameworks and development tools, e.g. NumPy/SciPy/Pandas, Spark, Kafka, Flink. Working knowledge of one or more relevant database technologies, e.g. Oracle, Postgres, MongoDB, ArcticDB. Proficient on Linux. Advantageous: an excellent understanding of financial markets and instruments …
MLflow, DVC, Docker, Kubernetes. Software development experience is desirable. Algorithm design experience is desirable. Data architecture knowledge is desirable. API design and deployment experience is desirable. Big data (e.g. Spark) experience is desirable. NoSQL DB experience is desirable. Qualifications: 5+ years of data science experience; right to work in the UK and/or EU; Azure Data Science Associate Certificate …
London, England, United Kingdom Hybrid / WFH Options
Canonical
operators and Linux open source infrastructure-as-code. Work across the entire Linux stack, from kernel, networking, and storage to applications. Architect cloud infrastructure solutions like Kubernetes, Kubeflow, OpenStack and Spark. Deliver solutions either on-premise or in public cloud (AWS, Azure, Google Cloud). Collect customer business requirements and advise them on Ubuntu and relevant open source applications. Grow a …
technologies – Azure, AWS, GCP, Snowflake, Databricks. Must Have: Hands-on experience on at least 2 Hyperscalers (GCP/AWS/Azure platforms), and specifically in Big Data processing services (Apache Spark, Beam or equivalent). In-depth knowledge of key technologies like BigQuery/Redshift/Synapse/Pub Sub/Kinesis/MQ/Event Hubs … skills. A minimum of 5 years' experience in a similar role. Ability to lead and mentor the architects. Mandatory Skills [at least 2 Hyperscalers]: GCP, AWS, Azure, Big Data, Apache Spark, Beam on BigQuery/Redshift/Synapse, Pub Sub/Kinesis/MQ/Event Hubs, Kafka, Dataflow/Airflow/ADF. Desirable Skills: Designing Databricks-based …
data programs. 5+ years of advanced expertise in Google Cloud data services: Dataproc, Dataflow, Pub/Sub, BigQuery, Cloud Spanner, and Bigtable. Hands-on experience with orchestration tools like Apache Airflow or Cloud Composer. Hands-on experience with one or more of the following GCP data processing services: Dataflow (Apache Beam), Dataproc (Apache Spark/Hadoop) … or Composer (Apache Airflow). Proficiency in at least one scripting/programming language (e.g., Python, Java, Scala) for data manipulation and pipeline development; Scala is mandated in some cases. Deep understanding of data lakehouse design, event-driven architecture, and hybrid cloud data strategies. Strong proficiency in SQL and experience with schema design and query optimization for large datasets. Expertise …
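For illustration only (not part of the listing): a minimal Python sketch of the kind of Dataflow (Apache Beam) job described above, reading events from a Pub/Sub topic and writing them to a BigQuery table. The topic, table, schema, and field names are hypothetical assumptions, and runner/project settings are left to CLI flags.

```python
# Illustrative Apache Beam streaming pipeline (Dataflow-style).
# Pub/Sub topic, BigQuery table, and field names are hypothetical.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_event(message: bytes) -> dict:
    """Decode a Pub/Sub message into a flat dict matching the BigQuery schema."""
    event = json.loads(message.decode("utf-8"))
    return {"event_id": event["id"], "event_type": event["type"], "ts": event["ts"]}


def run():
    options = PipelineOptions(streaming=True)  # runner, project, region passed as flags
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadEvents" >> beam.io.ReadFromPubSub(topic="projects/my-project/topics/events")
            | "Parse" >> beam.Map(parse_event)
            | "WriteToBQ" >> beam.io.WriteToBigQuery(
                "my-project:analytics.events",
                schema="event_id:STRING,event_type:STRING,ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```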
to-end, scalable data and AI solutions using the Databricks Lakehouse (Delta Lake, Unity Catalog, MLflow). Design and lead the development of modular, high-performance data pipelines using Apache Spark and PySpark. Champion the adoption of Lakehouse architecture (bronze/silver/gold layers) to ensure scalable, governed data platforms. Collaborate with stakeholders, analysts, and data scientists … Databricks Workflows. Drive performance tuning, cost optimisation, and monitoring across data workloads. Mentor engineering teams and support architectural decisions as a recognised Databricks expert. Demonstrable expertise with Databricks and Apache Spark in production environments. Proficiency in PySpark, SQL, and working within one or more cloud platforms (Azure, AWS, or GCP). In-depth understanding of Lakehouse concepts, medallion …
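For illustration only (not part of the listing): a minimal PySpark sketch of one bronze-to-silver step in the medallion (bronze/silver/gold) layering mentioned above, assuming a Databricks-style environment with Delta Lake available. The paths, schema, and column names are hypothetical.

```python
# Illustrative bronze -> silver step in a medallion (bronze/silver/gold) Lakehouse.
# Paths and column names are hypothetical; assumes Delta Lake is available on the cluster.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("bronze_to_silver").getOrCreate()

# Bronze: raw ingested records, stored as-is in Delta format.
bronze = spark.read.format("delta").load("/mnt/lake/bronze/orders")

# Silver: cleaned, deduplicated, conformed records ready for analytics.
silver = (
    bronze
    .filter(F.col("order_id").isNotNull())
    .dropDuplicates(["order_id"])
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("ingested_date", F.to_date("ingestion_time"))
)

(
    silver.write.format("delta")
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .save("/mnt/lake/silver/orders")
)
```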
City of London, London, United Kingdom Hybrid / WFH Options
Signify Technology
data loads, and data pipeline monitoring. Develop and optimise data pipelines for integrating structured and unstructured data from various internal and external sources. Leverage big data technologies such as Apache Spark, Kafka, and Scala to build robust and scalable data processing systems. Write clean, maintainable code in Python or Scala to support data transformation, orchestration, and integration tasks. … issues and optimise system performance. Qualifications: proficient in handling multiple data sources and integrating data across different systems; 4+ years' experience as a Data Engineer; hands-on expertise in Spark, Kafka, and other distributed data processing frameworks; solid programming skills in Python; strong familiarity with cloud data ecosystems, especially AWS; strong knowledge of dbt and Snowflake; strong problem-solving …
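For illustration only (not part of the listing): a minimal PySpark Structured Streaming sketch of the Spark-plus-Kafka pattern this role refers to. The broker address, topic, schema, and sink paths are hypothetical, and the Kafka source requires the spark-sql-kafka package on the cluster.

```python
# Illustrative Spark Structured Streaming job: consume JSON events from Kafka, land as Parquet.
# Broker, topic, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka_ingest").getOrCreate()

schema = StructType([
    StructField("trade_id", StringType()),
    StructField("symbol", StringType()),
    StructField("price", DoubleType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "trades")
    .load()
)

# Kafka delivers key/value as bytes; parse the value column into typed fields.
parsed = (
    raw.select(F.from_json(F.col("value").cast("string"), schema).alias("t"))
    .select("t.*")
)

query = (
    parsed.writeStream.format("parquet")
    .option("path", "s3://my-bucket/trades/")
    .option("checkpointLocation", "s3://my-bucket/checkpoints/trades/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```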
London, England, United Kingdom Hybrid / WFH Options
Endava Limited
delivering high-quality solutions aligned with business objectives. Key Responsibilities: Architect, implement, and maintain real-time and batch data pipelines to handle large datasets efficiently. Employ frameworks such as Apache Spark, Databricks, Snowflake, or Airflow to automate ingestion, transformation, and delivery. Data Integration & Transformation: Work with Data Analysts to understand source-to-target mappings and quality requirements. Build … security measures (RBAC, encryption) and ensure regulatory compliance (GDPR). Document data lineage and recommend improvements for data ownership and stewardship. Qualifications: Programming: Python, SQL, Scala, Java. Big Data: Apache Spark, Hadoop, Databricks, Snowflake, etc. Data Modelling: Designing dimensional, relational, and hierarchical data models. Scalability & Performance: Building fault-tolerant, highly available data architectures. Security & Compliance: Enforcing role-based …
and well-tested solutions to automate data ingestion, transformation, and orchestration across systems. Own data operations infrastructure: manage and optimise key data infrastructure components within AWS, including Amazon Redshift, Apache Airflow for workflow orchestration, and other analytical tools. You will be responsible for ensuring the performance, reliability, and scalability of these systems to meet the growing demands of data … pipelines, data warehouses, and leveraging AWS data services. Strong proficiency in DataOps methodologies and tools, including experience with CI/CD pipelines, containerized applications, and workflow orchestration using Apache Airflow. Familiar with ETL frameworks, with bonus experience in Big Data processing (Spark, Hive, Trino) and data streaming. Proven track record – you've made a demonstrable impact …
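For illustration only (not part of the listing): a small Airflow DAG sketch of the load-then-transform orchestration pattern described above, assuming a recent Airflow 2.x install. The DAG id, schedule, and the two task callables are hypothetical placeholders.

```python
# Illustrative Airflow DAG: ingest a daily extract, then refresh a Redshift reporting table.
# DAG id, schedule, and the task callables are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_extract(**context):
    # Placeholder: pull the day's files from source and copy them into Redshift staging.
    print(f"Ingesting extract for {context['ds']}")


def refresh_reporting(**context):
    # Placeholder: run the SQL that rebuilds the reporting tables from staging.
    print(f"Refreshing reporting tables for {context['ds']}")


with DAG(
    dag_id="daily_reporting_refresh",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(task_id="ingest_extract", python_callable=ingest_extract)
    refresh = PythonOperator(task_id="refresh_reporting", python_callable=refresh_reporting)

    ingest >> refresh  # run ingestion before the reporting refresh
```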
London Market Insurance; Sound understanding of data warehousing concepts; Contributions to well-designed data platform and data warehouse solutions; ETL/ELTs – SSIS & T-SQL, Data Factory, Databricks/Apache Spark; Data modelling; Strong communication skills and able to build relationships and trust with our stakeholders; Working within an agile delivery environment, leveraging tools such as Azure DevOps …
data pipelines and systems. Qualifications & Skills: 5+ years' experience with Python programming for data engineering tasks; strong proficiency in SQL and database management; hands-on experience with Databricks and Apache Spark; familiarity with the Azure cloud platform and related services; knowledge of data security best practices and compliance standards; excellent problem-solving and communication skills. Multi-Year Project – Flexible …
London, England, United Kingdom Hybrid / WFH Options
Flutter
across finance, technology, and security to ensure data flows securely and efficiently from external providers into our financial platforms. Key Responsibilities: Develop and maintain scalable data pipelines using Databricks, Spark, and Delta Lake to process large volumes of structured and semi-structured data. Design ETL/ELT workflows to extract data from third-party APIs and SFTP sources, standardise … fully documented and meet appropriate standards for security, resilience and operational support. Skills & Experience Required – Essential: Hands-on experience developing data pipelines in Databricks, with a strong understanding of Apache Spark and Delta Lake. Proficient in Python for data transformation and automation tasks. Solid understanding of AWS services, especially S3, Transfer Family, IAM, and VPC networking. Experience integrating …
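For illustration only (not part of the listing): a minimal PySpark sketch of the ingestion pattern described above — picking up provider files landed via SFTP/Transfer Family in S3 and upserting them into a Delta table. The bucket names, paths, and key column are hypothetical, and the Delta APIs assume the delta-spark package is available.

```python
# Illustrative Databricks-style job: upsert landed provider CSVs into a Delta table.
# Bucket, paths, and the record_id key column are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("provider_ingest").getOrCreate()

# Files delivered over SFTP (AWS Transfer Family) land in an S3 "landing" prefix.
incoming = (
    spark.read.option("header", "true")
    .csv("s3://finance-landing/provider-x/2024-06-01/")
)

# Upsert into the curated Delta table keyed on record_id.
target = DeltaTable.forPath(spark, "s3://finance-lake/silver/provider_x")

(
    target.alias("t")
    .merge(incoming.alias("s"), "t.record_id = s.record_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```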
City of London, London, United Kingdom Hybrid / WFH Options
twentyAI
agile environment to deliver data solutions that support key firm initiatives. Build scalable and efficient batch and streaming data workflows within the Azure ecosystem. Apply distributed processing techniques using Apache Spark to handle large datasets effectively. Help drive improvements in data quality, implementing validation, cleansing, and monitoring frameworks. Contribute to the firm’s efforts around data security, governance …
London, England, United Kingdom Hybrid / WFH Options
Derisk360
in Neo4j such as fraud detection, knowledge graphs, and network analysis. Optimize graph database performance, ensure query scalability, and maintain system efficiency. Manage ingestion of large-scale datasets using Apache Beam, Spark, or Kafka into GCP environments. Implement metadata management, security, and data governance using Data Catalog and IAM. Collaborate with cross-functional teams and clients across diverse …
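For illustration only (not part of the listing): a short Python sketch of the kind of fraud-detection network query a Neo4j role like this involves, using the official Neo4j Python driver. The connection details, node labels, and relationship types are hypothetical.

```python
# Illustrative Neo4j query: find pairs of accounts that share a device (a common fraud signal).
# Connection details, labels, and relationship types are hypothetical.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

QUERY = """
MATCH (a:Account)-[:USED_DEVICE]->(d:Device)<-[:USED_DEVICE]-(b:Account)
WHERE a <> b
RETURN a.id AS account, b.id AS linked_account, d.id AS shared_device
LIMIT 25
"""

with driver.session() as session:
    for record in session.run(QUERY):
        print(record["account"], record["linked_account"], record["shared_device"])

driver.close()
```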
and SQL for data pipelines. Experience with modern cloud data warehouses (like AWS Redshift, GCP BigQuery, Azure Synapse or Snowflake). Strong communication skills and fluency in English. Experience with Apache Spark (in both batch and streaming). Experience with a job orchestrator (Airflow, Google Cloud Composer, Flyte, Prefect, Dagster). Hands-on experience with AWS. Experience with dbt. *Typeform drives …