City of London, Greater London, UK Hybrid / WFH Options
83data
warehouse solutions for BI and analytics. Define and drive the long-term architecture and data strategy in alignment with business goals. Own orchestration of ETL/ELT workflows using Apache Airflow, including scheduling, monitoring, and alerting. Collaborate with cross-functional teams (Product, Engineering, Data Science, Compliance) to define data requirements and build reliable data flows. Champion best practices in … role. Proven experience designing and delivering enterprise data strategies. Exceptional communication and stakeholder management skills. Expertise in enterprise-grade data warehouses (Snowflake, BigQuery, Redshift). Hands-on experience with Apache Airflow (or similar orchestration tools). Strong proficiency in Python and SQL for pipeline development. Deep understanding of data architecture, dimensional modelling, and metadata management. Experience with cloud platforms …
City of London, London, United Kingdom Hybrid / WFH Options
OTA Recruitment
modern data modelling practices, analytics tooling, and interactive dashboard development in Power BI and Plotly/Dash. Key responsibilities: Designing and maintaining robust data transformation pipelines (ELT) using SQL, Apache Airflow, or similar tools. Building and optimizing data models that power dashboards and analytical tools. Developing clear, insightful, and interactive dashboards and reports using Power BI and Plotly/…
City Of London, England, United Kingdom Hybrid / WFH Options
Paul Murphy Associates
support market surveillance and compliance efforts. The platform leverages advanced analytics and machine learning to identify trading behaviors that could trigger regulatory attention. The tech stack includes Java, Python, Apache Spark (on Serverless EMR), AWS Lambda, DynamoDB, S3, SNS/SQS, and other cloud-native tools. You’ll work alongside a high-impact engineering team to build fault-tolerant … data pipelines and services that process massive time-series datasets in both real-time and batch modes. Key Responsibilities: Design and build scalable, distributed systems using Java, Python, and Apache Spark. Develop and optimize Spark jobs on AWS Serverless EMR for large-scale time-series processing. Build event-driven and batch workflows using AWS Lambda, SNS/SQS, and … and non-technical stakeholders. Qualifications: Strong backend software development experience, especially in distributed systems and large-scale data processing. Advanced Java programming skills (multithreading, concurrency, performance tuning). Expertise in Apache Spark and Spark Streaming. Proficiency with AWS services such as Lambda, DynamoDB, S3, SNS, SQS, and Serverless EMR. Experience with SQL and NoSQL databases. Hands-on Python experience, particularly …
are critical. The platform also leverages machine learning to help them detect trading behaviour that may trigger regulatory inquiries. In terms of the technical stack, this includes Java, Apache Spark (on Serverless EMR), AWS, DynamoDB, S3, SNS/SQS. Experience Required: Strong backend software engineering experience, ideally with distributed systems and large-scale data processing. Experience in financial … markets, specifically across trade surveillance or compliance software. Strong programming skills in Java (multithreading, concurrency, performance tuning). Deep experience with Apache Spark and Spark Streaming. Proficiency with cloud, ideally AWS services. Experience with SQL and NoSQL databases. Any Python experience beneficial, especially in data handling (pandas, scikit-learn, etc.). Familiarity with RESTful web services and event-driven architectures …
City of London, London, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
Requirements: 3+ years' data engineering experience, Snowflake experience, proficiency across an AWS tech stack, DevOps experience building and deploying using Terraform. Nice to Have: DBT, Data Modelling, Data Vault, Apache Airflow. Benefits: Up to 10% Bonus, Up to 14% Pension Contribution, 29 Days Annual Leave + Bank Holidays, Free Company Shares. Interviews ongoing, don't miss your chance to …
engineers + external partners) across complex data and cloud engineering projects. Designing and delivering distributed solutions on an AWS-centric stack, with open-source flexibility. Working with Databricks, Apache Iceberg, and Kubernetes in a cloud-agnostic environment. Guiding architecture and implementation of large-scale data pipelines for structured and unstructured data. Steering direction on software stack, best practices, and … especially AWS), and orchestration technologies. Proven delivery of big data solutions, not necessarily at FAANG scale, but managing high-volume, complex data (structured/unstructured). Experience working with Databricks, Apache Iceberg, or similar modern data platforms. Experience of building software environments from the ground up, setting best practice and standards. Experience leading and mentoring teams. Worked in a startup …
City of London, Greater London, UK Hybrid / WFH Options
Roc Search
Manage deployments with Helm and configuration in YAML. Develop shell scripts and automation for deployment and operational workflows. Work with Data Engineering to integrate and manage data workflows using Apache Airflow and DAG-based models. Perform comprehensive testing, debugging, and optimization of backend components. Required Skills: Bachelor's degree in Computer Science, Software Engineering, or a related field (or … and YAML for defining deployment configurations and managing releases. Proficiency in shell scripting for automating deployment and maintenance tasks. Understanding of DAG (Directed Acyclic Graph) models and experience with Apache Airflow for managing complex data processing workflows. Familiarity with database systems (SQL and NoSQL) and proficiency in writing efficient queries. Solid understanding of software development best practices, including version …
data programs. 5+ years of advanced expertise in Google Cloud data services: Dataproc, Dataflow, Pub/Sub, BigQuery, Cloud Spanner, and Bigtable. Hands-on experience with orchestration tools like Apache Airflow or Cloud Composer. Hands-on experience with one or more of the following GCP data processing services: Dataflow (Apache Beam), Dataproc (Apache Spark/Hadoop), or … Composer (Apache Airflow). Proficiency in at least one scripting/programming language (e.g., Python, Java, Scala) for data manipulation and pipeline development. Scala is mandated in some cases. Deep understanding of data lakehouse design, event-driven architecture, and hybrid cloud data strategies. Strong proficiency in SQL and experience with schema design and query optimization for large datasets. Expertise …
technologies – Azure, AWS, GCP, Snowflake, Databricks. Must Have: Hands-on experience on at least 2 Hyperscalers (GCP/AWS/Azure platforms) and specifically in Big Data processing services (Apache Spark, Beam or equivalent). In-depth knowledge of key technologies like BigQuery/Redshift/Synapse, Pub Sub/Kinesis/MQ/Event Hubs, Kafka … minimum of 5 years’ experience in a similar role. Ability to lead and mentor the architects. Required Skills: Mandatory Skills [at least 2 Hyperscalers]: GCP, AWS, Azure, Big Data, Apache Spark, Beam on BigQuery/Redshift/Synapse, Pub Sub/Kinesis/MQ/Event Hubs, Kafka, Dataflow/Airflow/ADF. Preferred Skills: Designing Databricks based solutions …
in Microsoft Fabric and Databricks, including data pipeline development, data warehousing, and data lake management. Proficiency in Python, SQL, Scala, or Java. Experience with data processing frameworks such as Apache Spark, Apache Beam, or Azure Data Factory. Strong understanding of data architecture principles, data modelling, and data governance. Experience with cloud-based data platforms, including Azure and/or …
City of London, Greater London, UK Hybrid / WFH Options
SGI
for deployment and workflow orchestration. Solid understanding of financial data and modelling techniques (preferred). Excellent analytical, communication, and problem-solving skills. Experience with data engineering & ETL tools such as Apache Airflow or custom ETL scripts. Strong problem-solving skills with a keen analytical mindset, especially in handling large data sets and complex data transformations. Strong experience in setting up …
teams • Mentor junior developers Requirements: • British-born sole UK National with active SC or DV Clearance • Strong Java skills, familiarity with Python • Experience in Linux, Git, CI/CD, Apache NiFi • Knowledge of Oracle, MongoDB, React, Elasticsearch • Familiarity with AWS (EC2, EKS, Fargate, S3, Lambda). Active DV Clearance. If you do not meet all requirements, still feel free to …
City of London, London, United Kingdom Hybrid / WFH Options
Signify Technology
data loads, and data pipeline monitoring. Develop and optimise data pipelines for integrating structured and unstructured data from various internal and external sources. Leverage big data technologies such as Apache Spark, Kafka, and Scala to build robust and scalable data processing systems. Write clean, maintainable code in Python or Scala to support data transformation, orchestration, and integration tasks. Work …
environment. Data Modeling, metadata management. Git, GitHub, GitLab, Jenkins. Regulatory compliance knowledge: Basel, MiFID, GDPR. Big Data. Cloud security and access controls (IAM, RBAC). Familiarity with Docker, Kubernetes, Apache …
Strong understanding of data security, quality, and governance principles. Excellent communication and collaboration skills across technical and non-technical teams. Bonus Points For: Experience with orchestration tools like Apache Airflow. Familiarity with real-time data processing and event-driven systems. Knowledge of observability and anomaly detection in production environments. Exposure to visualization tools like Tableau or Looker. Relevant …
City of London, London, United Kingdom Hybrid / WFH Options
Owen Thomas | Pending B Corp™
with cloud platforms (GCP preferred). Experience with CI/CD pipelines and version control. Proficiency in data visualisation tools (e.g. Tableau, PowerBI). Exposure to tools like DBT, Apache Airflow, Docker. Experience working with large-scale datasets (terabyte-level or higher). Excellent problem-solving capabilities. Strong communication and collaboration skills. Proficiency in Python and SQL (or similar …
for large datasets. Expertise in BigQuery, including advanced SQL, partitioning, clustering, and performance tuning. Hands-on experience with at least one of the following GCP data processing services: Dataflow (Apache Beam), Dataproc (Apache Spark/Hadoop), or Composer (Apache Airflow). Proficiency in at least one scripting/programming language (e.g., Python, Java, Scala) for data manipulation … Git). 5+ years of advanced expertise in Google Cloud data services: Dataproc, Dataflow, Pub/Sub, BigQuery, Cloud Spanner, and Bigtable. Hands-on experience with orchestration tools like Apache Airflow or Cloud Composer. Hands-on experience with one or more of the following GCP data processing services: Dataflow (Apache Beam), Dataproc (Apache Spark/Hadoop), or … Composer (Apache Airflow). Proficiency in at least one scripting/programming language (e.g., Python, Java, Scala) for data manipulation and pipeline development. Scala is mandated in some cases. Deep understanding of data lakehouse design, event-driven architecture, and hybrid cloud data strategies. Strong proficiency in SQL and experience with schema design and query optimization for large datasets. Expertise …
Substantial experience using tools for statistical modelling of large data sets. Some familiarity with data workflow management tools such as Airflow, as well as big data technologies such as Apache Spark or other caching and analytics technologies. Expertise in model training, statistics, model evaluation, deployment and optimisation, including RAG-based architectures.
experience leading data or platform teams in a production environment. Proven success with modern data infrastructure: distributed systems, batch and streaming pipelines. Hands-on knowledge of tools such as Apache Spark, Kafka, Databricks, DBT or similar. Familiarity with data warehousing, ETL/ELT processes, and analytics engineering. Programming proficiency in Python, Scala or Java. Experience operating in a cloud …
City of London, Greater London, UK Hybrid / WFH Options
Signify Technology
bring: Proven experience managing large-scale data infrastructure or platform teams, ideally in a consumer tech or marketplace environment. Deep understanding of distributed systems and modern data tools (e.g., Apache Spark, Kafka, DBT, Databricks). Experience with both batch and real-time data processing architectures. Strong programming background in Python, Scala, or Java. Familiarity with cloud platforms such as …
City of London, London, Tottenham Court Road, United Kingdom Hybrid / WFH Options
Cathcart Technology
with new methodologies to enhance the user experience. Key skills: ** Senior Data Scientist experience ** Commercial experience in Generative AI and recommender systems ** Strong Python and SQL experience ** Spark/Apache Airflow ** LLM experience ** MLOps experience ** AWS Additional information: This role offers a strong salary of up to £95,000 (depending on experience/skill) with hybrid working (2 days …
in Python with libraries like TensorFlow, PyTorch, or Scikit-learn for ML, and Pandas, PySpark, or similar for data processing. Experience designing and orchestrating data pipelines with tools like Apache Airflow, Spark, or Kafka. Strong understanding of SQL, NoSQL, and data modeling. Familiarity with cloud platforms (AWS, Azure, GCP) for deploying ML and data solutions. Knowledge of MLOps practices …