data architectures, such as lakehouse. Experience with CI/CD pipelines, version control systems like Git, and containerization (e.g., Docker). Experience with ETL tools and technologies such as Apache Airflow, Informatica, or Talend. Strong understanding of data governance and best practices in data management. Experience with cloud platforms and services such as AWS, Azure, or GCP for deploying … and managing data solutions. Strong problem-solving and analytical skills with the ability to diagnose and resolve complex data-related issues. SQL (for database management and querying); Apache Spark (for distributed data processing); Apache Spark Streaming, Kafka, or similar (for real-time data streaming); experience using data tools in at least one cloud service - AWS, Azure, or GCP …
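To illustrate the real-time streaming skills named in the listing above, here is a minimal sketch of a PySpark Structured Streaming job that reads from Kafka. It assumes the spark-sql-kafka connector is on the classpath; the broker address and topic name are placeholders, not details from the role.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Minimal sketch: stream events from a Kafka topic and print them to the console.
# Broker address and topic name are illustrative assumptions.
spark = (
    SparkSession.builder
    .appName("kafka-stream-sketch")
    .getOrCreate()
)

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
    .option("subscribe", "events")                        # assumed topic
    .load()
)

# Kafka delivers key/value as binary; cast the payload to a string for inspection.
decoded = events.select(col("value").cast("string").alias("payload"))

query = (
    decoded.writeStream
    .format("console")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```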
ability to effectively collaborate with stakeholders at all levels, provide training, and solicit feedback. Preferred qualifications, capabilities, and skills: Experience with big-data technologies, such as Splunk, Trino, and Apache Iceberg. Data science experience. AI/ML experience building models. AWS certification (e.g., AWS Certified Solutions Architect, AWS Certified Developer). About Us: J.P. Morgan is a global …
warehouse solutions for BI and analytics. Define and drive the long-term architecture and data strategy in alignment with business goals. Own orchestration of ETL/ELT workflows using Apache Airflow, including scheduling, monitoring, and alerting. Collaborate with cross-functional teams (Product, Engineering, Data Science, Compliance) to define data requirements and build reliable data flows. Champion best practices in … Proven experience designing and delivering enterprise data strategies. Exceptional communication and stakeholder management skills. Expertise in enterprise-grade data warehouses (Snowflake, BigQuery, Redshift). Hands-on experience with Apache Airflow (or similar orchestration tools). Strong proficiency in Python and SQL for pipeline development. Deep understanding of data architecture, dimensional modelling, and metadata management. Experience with cloud platforms …
Experience using modern data architectures, such as lakehouse. Experience with CI/CD pipelines and version control systems like Git. Knowledge of ETL tools and technologies such as Apache Airflow, Informatica, or Talend. Knowledge of data governance and best practices in data management. Familiarity with cloud platforms and services such as AWS, Azure, or GCP for deploying and … managing data solutions. Strong problem-solving and analytical skills with the ability to diagnose and resolve complex data-related issues. SQL (for database management and querying); Apache Spark (for distributed data processing); Apache Spark Streaming, Kafka, or similar (for real-time data streaming); experience using data tools in at least one cloud service - AWS, Azure, or GCP (e.g. …
critical. The platform also leverages machine learning to help detect trading behaviour that may trigger regulatory inquiries. The technical stack includes Java, Python, Apache Spark (on Serverless EMR), AWS, DynamoDB, S3, and SNS/SQS. Experience required: Strong backend software engineering experience, ideally with distributed systems and large-scale data processing; experience in financial … markets, specifically across trade surveillance or compliance software; strong programming skills in Java (multithreading, concurrency, performance tuning); deep experience with Apache Spark and Spark Streaming; proficiency with AWS services, ideally including tools such as Lambda, DynamoDB, S3, SNS, SQS, and Serverless EMR; experience with SQL and NoSQL databases; hands-on with Python, especially in data handling (pandas, scikit-learn …
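As a rough illustration of the pandas/scikit-learn data-handling skills this listing asks for, the sketch below flags unusual trades with an isolation forest. The column names, example values, and contamination rate are purely illustrative assumptions, not part of the platform described above.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Illustrative trade data; in practice this would come from the surveillance pipeline.
trades = pd.DataFrame({
    "notional": [10_000, 12_500, 9_800, 11_200, 950_000, 10_700],
    "quantity": [100, 120, 95, 110, 9_000, 105],
})

# Fit an isolation forest and mark outliers (-1) as candidates for review.
model = IsolationForest(contamination=0.1, random_state=42)
trades["flag"] = model.fit_predict(trades[["notional", "quantity"]])
suspicious = trades[trades["flag"] == -1]
print(suspicious)
```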
award-winning trading and surveillance platform, including TT Trade Surveillance, which leverages machine learning to detect trading behavior that may trigger regulatory inquiries. Our tech stack includes Java, Python, Apache Spark (on Serverless EMR), AWS Lambda, DynamoDB, S3, SNS/SQS, and other cloud-native services. As part of a high-impact engineering team, you'll help design and … problems at scale in a domain where precision, performance, and reliability are critical. What Will You Be Involved With? Design and build scalable, distributed systems using Java, Python, and Apache Spark. Develop and optimize Spark jobs on AWS Serverless EMR for processing large-scale time-series datasets. Build event-driven and batch processing workflows using Lambda, SNS/SQS … to the Table? Strong backend software engineering experience, ideally with distributed systems and large-scale data processing; strong programming skills in Java (multithreading, concurrency, performance tuning); deep experience with Apache Spark and Spark Streaming; proficiency with AWS services, including Lambda, DynamoDB, S3, SNS, SQS, and Serverless EMR; experience with SQL and NoSQL databases; hands-on with Python, especially in …
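For the event-driven part of the stack mentioned above, a Lambda handler that consumes SQS messages and fans alerts out to SNS might look roughly like the sketch below. The environment variable name, message fields, and alert threshold are assumptions for illustration only.

```python
import json
import os

import boto3

# Sketch of an SQS-triggered Lambda: parse each record and publish an alert to SNS.
sns = boto3.client("sns")
TOPIC_ARN = os.environ.get("ALERT_TOPIC_ARN", "")  # hypothetical env var

def handler(event, context):
    records = event.get("Records", [])
    for record in records:
        payload = json.loads(record["body"])
        if payload.get("score", 0) > 0.9:  # hypothetical alert threshold
            sns.publish(
                TopicArn=TOPIC_ARN,
                Subject="Surveillance alert",
                Message=json.dumps(payload),
            )
    return {"processed": len(records)}
```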
Deep understanding of software architecture, object-oriented design principles, and data structures. Extensive experience developing microservices using Java and Python. Experience with distributed computing frameworks such as Hive/Hadoop and Apache Spark. Good experience in test-driven development and automating test cases using Java/Python. Experience in SQL/NoSQL (Oracle, Cassandra) database design. Demonstrated ability to be proactive … HR-related applications. Experience with the following cloud services: AWS Elastic Beanstalk, EC2, S3, CloudFront, RDS, DynamoDB, VPC, ElastiCache, Lambda. Working experience with Terraform. Experience creating workflows for Apache Airflow. About Roku: Roku pioneered streaming to the TV. We connect users to the streaming content they love, enable content publishers to build and monetize large audiences, and provide …
City of London, London, United Kingdom Hybrid / WFH Options
OTA Recruitment
modern data modelling practices, analytics tooling, and interactive dashboard development in Power BI and Plotly/Dash. Key responsibilities: Designing and maintaining robust data transformation pipelines (ELT) using SQL, Apache Airflow, or similar tools. Building and optimizing data models that power dashboards and analytical tools. Developing clear, insightful, and interactive dashboards and reports using Power BI and Plotly/Dash …
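A minimal Plotly Dash app of the kind this listing describes could be sketched as follows, assuming a recent Dash version; the bundled example dataset and chart are placeholders for the real analytics content.

```python
import plotly.express as px
from dash import Dash, dcc, html

# Minimal Dash dashboard sketch using a bundled example dataset.
df = px.data.gapminder().query("year == 2007")
fig = px.scatter(
    df, x="gdpPercap", y="lifeExp", size="pop", color="continent",
    log_x=True, title="Life expectancy vs. GDP per capita (2007)",
)

app = Dash(__name__)
app.layout = html.Div([
    html.H2("Example dashboard"),
    dcc.Graph(figure=fig),
])

if __name__ == "__main__":
    app.run(debug=True)
```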
In this role, you will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a passion for data engineering and a … significant impact, we encourage you to apply! Job Responsibilities: ETL/ELT Pipeline Development: Design, develop, and optimize efficient and scalable ETL/ELT pipelines using Python, PySpark, and Apache Airflow. Implement batch and real-time data processing solutions using Apache Spark. Ensure data quality, governance, and security throughout the data lifecycle. Cloud Data Engineering: Manage and optimize … effectiveness. Implement and maintain CI/CD pipelines for data workflows to ensure smooth and reliable deployments. Big Data & Analytics: Develop and optimize large-scale data processing pipelines using Apache Spark and PySpark. Implement data partitioning, caching, and performance tuning techniques to enhance Spark-based workloads. Work with diverse data formats (structured and unstructured) to support advanced analytics and …
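A skeleton of the kind of Airflow orchestration described in this listing might look like the following, assuming Airflow 2.x; the DAG id, schedule, and task bodies are illustrative placeholders rather than details from the role.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder: pull raw files from the landing zone.
    print("extracting raw data")

def transform():
    # Placeholder: trigger the PySpark transformation job.
    print("running Spark transformation")

def load():
    # Placeholder: publish curated tables for analytics.
    print("loading curated data")

with DAG(
    dag_id="example_elt_pipeline",   # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Linear dependency chain: extract, then transform, then load.
    extract_task >> transform_task >> load_task
```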
City of London, England, United Kingdom Hybrid / WFH Options
Paul Murphy Associates
support market surveillance and compliance efforts. The platform leverages advanced analytics and machine learning to identify trading behaviors that could trigger regulatory attention. The tech stack includes Java, Python, Apache Spark (on Serverless EMR), AWS Lambda, DynamoDB, S3, SNS/SQS, and other cloud-native tools. You'll work alongside a high-impact engineering team to build fault-tolerant … data pipelines and services that process massive time-series datasets in both real-time and batch modes. Key Responsibilities: Design and build scalable, distributed systems using Java, Python, and Apache Spark. Develop and optimize Spark jobs on AWS Serverless EMR for large-scale time-series processing. Build event-driven and batch workflows using AWS Lambda, SNS/SQS, and … and non-technical stakeholders. Qualifications: Strong backend software development experience, especially in distributed systems and large-scale data processing; advanced Java programming skills (multithreading, concurrency, performance tuning); expertise in Apache Spark and Spark Streaming; proficiency with AWS services such as Lambda, DynamoDB, S3, SNS, SQS, and Serverless EMR; experience with SQL and NoSQL databases; hands-on Python experience, particularly …
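In its simplest form, the large-scale time-series processing mentioned here could look like the PySpark sketch below, which buckets trade volume into five-minute windows per symbol. The input path, column names, and window size are assumptions for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Sketch: aggregate trade volume into 5-minute buckets per symbol.
spark = SparkSession.builder.appName("trade-aggregation-sketch").getOrCreate()

trades = spark.read.parquet("s3://example-bucket/trades/")  # hypothetical location

volume_by_window = (
    trades
    .groupBy(
        F.window(F.col("event_time"), "5 minutes"),  # assumes a timestamp column
        F.col("symbol"),
    )
    .agg(
        F.sum("quantity").alias("total_quantity"),
        F.count("*").alias("trade_count"),
    )
)

volume_by_window.write.mode("overwrite").parquet("s3://example-bucket/trade-volume/")
```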
London, South East, England, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
Requirements: 3+ years' data engineering experience; Snowflake experience; proficiency across an AWS tech stack; DevOps experience building and deploying using Terraform. Nice to have: DBT; data modelling; Data Vault; Apache Airflow. Benefits: up to 10% bonus; up to 14% pension contribution; 29 days annual leave + bank holidays; free company shares. Interviews ongoing - don't miss your chance to …
with interface/API data modeling. Knowledge of CI/CD tools like GitHub Actions or similar. AWS certifications such as AWS Certified Data Engineer. Knowledge of Snowflake, SQL, Apache Airflow, and DBT. Familiarity with Atlan for data cataloging and metadata management. Understanding of Iceberg tables. Who we are: We're a global business empowering local teams with exciting …
or MS degree in Computer Science or equivalent. Experience developing Finance or HR-related applications. Working experience with Tableau. Working experience with Terraform. Experience creating workflows for Apache Airflow and Jenkins. Benefits: Roku is committed to offering a diverse range of benefits as part of our compensation package to support our employees and their families. Our comprehensive …
indexing, partitioning. Hands-on IaC development experience with Terraform or CloudFormation. Understanding of ML development workflows and knowledge of when and how to use dedicated hardware. Significant experience with Apache Spark or other distributed data programming frameworks (e.g., Flink, Hadoop, Beam). Familiarity with Databricks as a data and AI platform or the Lakehouse architecture. Experience with data quality …
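As a small illustration of the Spark and data-quality experience this listing pairs together, the sketch below runs two basic checks on a DataFrame: per-column null counts and duplicate business keys. The input path and key column name are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Basic data-quality sketch: null counts and duplicate detection on a key column.
spark = SparkSession.builder.appName("dq-checks-sketch").getOrCreate()

df = spark.read.parquet("s3://example-bucket/curated/orders/")  # hypothetical input

# Count nulls per column.
null_counts = df.select([
    F.sum(F.col(c).isNull().cast("int")).alias(c) for c in df.columns
])
null_counts.show()

# Flag duplicate business keys (assumed key column: order_id).
duplicates = df.groupBy("order_id").count().filter(F.col("count") > 1)
print(f"duplicate keys: {duplicates.count()}")
```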
as system engineers to support both data and application integrations using bespoke tools written in Python/Java, as well as tools such as Meltano, Airflow, MuleSoft/SnapLogic, Apache NiFi, and Kafka, ensuring a robust, well-modelled, and scalable data analytics infrastructure running primarily on MySQL- and Postgres-style databases. Requirements: Advanced SQL development and deep understanding of … integration (REST/SOAP); proficiency in at least one object/procedural/functional language (e.g., Java, PHP, Python); familiarity with EAI tools such as MuleSoft/SnapLogic or Apache NiFi; experience with infrastructure-as-code tools such as Terraform and Ansible; experience with version control (e.g., Git, SVN) and CI/CD workflows for deployment; experience scraping external …
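For the Kafka side of those integrations, a bare-bones Python consumer (here using the kafka-python package) might look like this; the broker address, topic, and consumer group are assumptions, and the database upsert is left as a placeholder.

```python
import json

from kafka import KafkaConsumer

# Minimal consumer sketch; broker address, topic, and group id are illustrative.
consumer = KafkaConsumer(
    "integration-events",                 # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="analytics-loader",          # hypothetical consumer group
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    record = message.value
    # Placeholder: upsert the record into MySQL/Postgres here.
    print(message.topic, message.offset, record)
```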
systems, with a focus on data quality and reliability. Design and manage data storage solutions, including databases, warehouses, and lakes. Leverage cloud-native services and distributed processing tools (e.g., Apache Flink, AWS Batch) to support large-scale data workloads. Operations & Tooling: Monitor, troubleshoot, and optimize data pipelines to ensure performance and cost efficiency. Implement data governance, access controls, and … ELT pipelines and data architectures. Hands-on expertise with cloud platforms (e.g., AWS) and cloud-native data services. Comfortable with big data tools and distributed processing frameworks such as Apache Flink or AWS Batch. Strong understanding of data governance, security, and best practices for data quality. Effective communicator with the ability to work across technical and non-technical teams. … Additional Strengths: Experience with orchestration tools like Apache Airflow. Knowledge of real-time data processing and event-driven architectures. Familiarity with observability tools and anomaly detection for production systems. Exposure to data visualization platforms such as Tableau or Looker. Relevant cloud or data engineering certifications. What we offer: A collaborative and transparent company culture founded on Integrity, Innovation and …
In this role, you will be responsible for designing, building, and maintaining robust data pipelines and infrastructure on the Azure cloud platform. You will leverage your expertise in PySpark, Apache Spark, and Apache Airflow to process and orchestrate large-scale data workloads, ensuring data quality, efficiency, and scalability. If you have a passion for data engineering and a … data processing workloads. Implement CI/CD pipelines for data workflows to ensure smooth and reliable deployments. Big Data & Analytics: Build and optimize large-scale data processing pipelines using Apache Spark and PySpark. Implement data partitioning, caching, and performance tuning for Spark-based workloads. Work with diverse data formats (structured and unstructured) to support advanced analytics and machine learning …
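The partitioning and caching techniques this listing calls out might be applied roughly as in the sketch below; the ADLS paths, column names, and date filter are illustrative assumptions.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Sketch of caching and partitioned writes to speed up repeated Spark workloads.
spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

# Assumed ADLS Gen2 path; replace with the real storage account and container.
events = spark.read.parquet("abfss://raw@exampleaccount.dfs.core.windows.net/events/")

# Cache a filtered subset that several downstream steps reuse.
recent = events.filter(F.col("event_date") >= "2024-01-01").cache()
recent.count()  # materialise the cache

# Write the output partitioned by date so later reads can prune partitions.
(
    recent
    .repartition("event_date")
    .write.mode("overwrite")
    .partitionBy("event_date")
    .parquet("abfss://curated@exampleaccount.dfs.core.windows.net/events/")
)
```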
to-end, scalable data and AI solutions using the Databricks Lakehouse (Delta Lake, Unity Catalog, MLflow). Design and lead the development of modular, high-performance data pipelines using Apache Spark and PySpark. Champion the adoption of Lakehouse architecture (bronze/silver/gold layers) to ensure scalable, governed data platforms. Collaborate with stakeholders, analysts, and data scientists to … performance tuning, cost optimisation, and monitoring across data workloads. Mentor engineering teams and support architectural decisions as a recognised Databricks expert. Essential Skills & Experience: Demonstrable expertise with Databricks and Apache Spark in production environments. Proficiency in PySpark, SQL, and working within one or more cloud platforms (Azure, AWS, or GCP). In-depth understanding of Lakehouse concepts, medallion architecture …
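A stripped-down version of the bronze-to-silver step in that medallion layout could look like the following, assuming a Databricks runtime (or a Spark session with Delta Lake configured); the table names and business key are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Sketch: promote raw bronze records to a cleaned silver Delta table.
spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

bronze = spark.read.table("bronze.orders")  # hypothetical bronze table

silver = (
    bronze
    .dropDuplicates(["order_id"])            # assumed business key
    .filter(F.col("order_id").isNotNull())
    .withColumn("ingested_at", F.current_timestamp())
)

(
    silver.write.format("delta")
    .mode("overwrite")
    .saveAsTable("silver.orders")            # hypothetical silver table
)
```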
ad hoc to scalable, production-grade systems - Collaborate closely with senior stakeholders and shape the future of the data platform - Python | SQL | Snowflake | AWS (S3, Docker, Terraform) | Airflow | DBT | Apache Spark & Iceberg | PostgreSQL. The ideal candidate: - 5+ years as a senior/principal data engineer - Experience leading or mentoring engineers - Strong technical grounding - Start-up/scale-up mindset …
pipelines. Hands-on experience with Agile (Scrum) methodologies. Database experience with Oracle and/or MongoDB. Experience using the Atlassian suite: Bitbucket, Jira, and Confluence. Desirable Skills: Knowledge of Apache NiFi; front-end development with React (JavaScript/TypeScript); working knowledge of Elasticsearch and Kibana; experience developing for cloud environments, particularly AWS (EC2, EKS, Fargate, IAM, S3, Lambda); understanding …