privacy, and security, ensuring our AI systems are developed and used responsibly and ethically. Tooling the Future: Get hands-on with cutting-edge technologies like Hugging Face, PyTorch, TensorFlow, Apache Spark, Apache Airflow, and other modern data and ML frameworks. Collaborate and Lead: Partner closely with ML Engineers, Data Scientists, and Researchers to understand their data needs … their data, compute, and storage services. Programming Prowess: Strong programming skills in Python and SQL are essential. Big Data Ecosystem Expertise: Hands-on experience with big data technologies like Apache Spark, Kafka, and data orchestration tools such as Apache Airflow or Prefect. ML Data Acumen: Solid understanding of data requirements for machine learning models, including feature engineering …
extract data from diverse sources, transform it into usable formats, and load it into data warehouses, data lakes or lakehouses. Big Data Technologies: Utilize big data technologies such as Spark, Kafka, and Flink for distributed data processing and analytics. Cloud Platforms: Deploy and manage data solutions on cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP), leveraging … SQL for data manipulation and scripting. Strong understanding of data modelling concepts and techniques, including relational and dimensional modelling. Experience in big data technologies and frameworks such as Databricks, Spark, Kafka, and Flink. Experience in using modern data architectures, such as lakehouse. Experience with CI/CD pipelines, version control systems like Git, and containerization (e.g., Docker). Experience with ETL tools and technologies such as Apache Airflow, Informatica, or Talend. Strong understanding of data governance and best practices in data management. Experience with cloud platforms and services such as AWS, Azure, or GCP for deploying and managing data solutions. Strong problem-solving and analytical skills with the ability to diagnose and resolve complex data-related issues. SQL …
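By way of illustration, here is a minimal PySpark sketch of the extract-transform-load pattern this listing describes. The bucket paths, schema, and column names are hypothetical stand-ins, not details from the posting:

```python
from pyspark.sql import SparkSession, functions as F

# Minimal batch ETL sketch: extract raw CSV, transform, load to Parquet.
# Paths and column names are illustrative only.
spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw events from a landing zone (hypothetical path).
raw = spark.read.option("header", True).csv("s3://landing-zone/events/")

# Transform: cast types, drop malformed rows, derive a date partition column.
clean = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .dropna(subset=["event_id", "amount"])
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: write partitioned Parquet into the curated lake/warehouse layer.
clean.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://curated-zone/events/"
)
```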
extract data from diverse sources, transform it into usable formats, and load it into data warehouses, data lakes or lakehouses. Big Data Technologies: Utilize big data technologies such as Spark, Kafka, and Flink for distributed data processing and analytics. Cloud Platforms: Deploy and manage data solutions on cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP), leveraging … SQL for data manipulation and scripting. Strong understanding of data modelling concepts and techniques, including relational and dimensional modelling. Experience in big data technologies and frameworks such as Databricks, Spark, Kafka, and Flink. Experience in using modern data architectures, such as lakehouse. Experience with CI/CD pipelines and version control systems like Git. Knowledge of ETL tools and … technologies such as Apache Airflow, Informatica, or Talend. Knowledge of data governance and best practices in data management. Familiarity with cloud platforms and services such as AWS, Azure, or GCP for deploying and managing data solutions. Strong problem-solving and analytical skills with the ability to diagnose and resolve complex data-related issues. SQL (for database management and querying …
services such as S3, Glue, Lambda, Redshift, EMR, Kinesis, and more, covering data pipelines, warehousing, and lakehouse architectures. Drive the migration of legacy data workflows to Lakehouse architectures, leveraging Apache Iceberg to enable unified analytics and scalable data management. Operate as a subject matter expert across multiple data projects, providing strategic guidance on best practices in design, development, and … in designing and implementing scalable data engineering solutions. Bring extensive experience in software architecture and solution design, ensuring robust and future-proof systems. Hold specialised proficiency in Python and Apache Spark, enabling efficient processing of large-scale data workloads. Demonstrate the ability to set technical direction, uphold high standards for code quality, and optimise performance in data-intensive … of continuous learning and innovation. Extensive background in software architecture and solution design, with deep expertise in microservices, distributed systems, and cloud-native architectures. Advanced proficiency in Python and Apache Spark, with a strong focus on ETL data processing and scalable data engineering workflows. In-depth technical knowledge of AWS data services, with hands-on experience implementing data …
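As a hedged sketch of the Iceberg-based lakehouse migration this role mentions, the snippet below writes a legacy Parquet dataset into an Apache Iceberg table from Spark. It assumes the iceberg-spark-runtime package is on the classpath; the catalog name `lake`, warehouse path, and table names are hypothetical:

```python
from pyspark.sql import SparkSession

# Sketch: configure a Hadoop-type Iceberg catalog and migrate a legacy
# Parquet dataset into an Iceberg table. All names are illustrative;
# real settings depend on the chosen catalog implementation.
spark = (
    SparkSession.builder.appName("iceberg-migration-sketch")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3://lakehouse/warehouse")
    .getOrCreate()
)

legacy = spark.read.parquet("s3://legacy-bucket/orders/")

# DataFrameWriterV2 API: create (or replace) the Iceberg table from the data.
legacy.writeTo("lake.sales.orders").using("iceberg").createOrReplace()

# The Iceberg table is then queryable like any other Spark SQL table.
spark.sql("SELECT count(*) FROM lake.sales.orders").show()
```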
and/or demonstrated competence in OLTP systems along with one of Azure, AWS or GCP cloud providers Demonstrated competence in the Lakehouse architecture including hands-on experience with Apache Spark, Python and SQL Excellent communication skills; both written and verbal Experience in pre-sales selling highly desired About Databricks Databricks is the data and AI company. More … Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive benefits and perks that …
and streaming data pipelines Azure Purview or equivalent for data governance and lineage tracking Experience with data integration, MDM, governance, and data quality tools. Hands-on experience with Apache Spark, Python, SQL, and Scala for data processing. Strong understanding of Azure networking, security, and IAM, including Azure Private Link, VNETs, Managed Identities, and RBAC. Deep knowledge … for scalable data lakes …
production environments and deliver fixes and improvements Contribute to system design and development planning Learn and apply modern tools such as Python, PowerShell, Snowflake, and potentially streaming tech (e.g., Apache Spark) Engage with cloud, data warehouse, and modern ETL concepts as the platform evolves Maintain high code quality with unit testing and best practices Take ownership of code … skills, both written and verbal Self-motivated with a proactive approach to continuous learning Familiarity with query performance tuning and execution plans Experience with Python, Snowflake, PowerShell Exposure to Apache Spark, Databricks, or other streaming platforms Awareness of cloud computing concepts and data warehousing Understanding of EMIR or MiFIR regulatory reporting solutions Why join us: Career coaching, mentoring …
London, South East, England, United Kingdom Hybrid / WFH Options
Randstad Technologies
scalable data pipelines, specifically using the Hadoop ecosystem and related tools. The role will focus on designing, building and maintaining scalable data pipelines using big data Hadoop ecosystems and Apache Spark for large datasets. A key responsibility is to analyse infrastructure logs and operational data to derive insights, demonstrating a strong understanding of both data processing and the … underlying systems. The successful candidate should have the following key skills: experience with Open Data Platform; hands-on experience with Python for scripting; Apache Spark; prior experience of building ETL pipelines; data modelling. 6 Months Contract - Remote Working - £300 to £350 a day Inside IR35. If you are an experienced Hadoop engineer looking for a new role then …
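To illustrate the infrastructure-log analysis this role describes, here is a small PySpark sketch that derives an hourly error count per service. The log layout ("timestamp level service message") and the HDFS path are assumptions for illustration:

```python
from pyspark.sql import SparkSession, functions as F

# Sketch: derive a simple operational insight from raw infrastructure logs.
spark = SparkSession.builder.appName("log-insights-sketch").getOrCreate()

logs = spark.read.text("hdfs:///logs/infra/*.log")

# Assumed line layout: "<date time> <LEVEL> <service> <message>"
pattern = r"^(\S+ \S+) (\w+) (\S+) (.*)$"
parsed = logs.select(
    F.regexp_extract("value", pattern, 1).alias("ts"),
    F.regexp_extract("value", pattern, 2).alias("level"),
    F.regexp_extract("value", pattern, 3).alias("service"),
)

# Hourly error counts per service, highest first.
(parsed.filter(F.col("level") == "ERROR")
       .withColumn("hour", F.date_trunc("hour", F.to_timestamp("ts")))
       .groupBy("service", "hour")
       .count()
       .orderBy(F.desc("count"))
       .show())
```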
Role Title: Infrastructure/Platform Engineer - Apache Duration: 9 Months Location: Remote Rate: £ - Umbrella only Would you like to join a global leader in consulting, technology services and digital transformation? Our client is at the forefront of innovation to address the entire breadth of opportunities in the evolving world of cloud, digital and platforms. Role purpose/summary: • Refactor … prototype Spark jobs into production-quality components, ensuring scalability, test coverage, and integration readiness. • Package Spark workloads for deployment via Docker/Kubernetes and integrate with orchestration systems (e.g., Airflow, custom schedulers). • Work with platform engineers to embed Spark jobs into InfoSum's platform APIs and data pipelines. • Troubleshoot job failures, memory and resource issues, and … execution anomalies across various runtime environments. • Optimize Spark job performance and advise on best practices to reduce cloud compute and storage costs. • Guide engineering teams on choosing the right execution strategies across AWS, GCP, and Azure. • Provide subject matter expertise on using AWS Glue for ETL workloads and integration with S3 and other AWS-native services. • Implement observability tooling …
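As a sketch of the orchestration integration this role calls for, the following minimal Airflow DAG submits a packaged Spark job on a daily schedule. It assumes a recent Airflow 2.x install with the apache-airflow-providers-apache-spark package and a configured `spark_default` connection; the DAG id, application path, and executor setting are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import (
    SparkSubmitOperator,
)

# Sketch: schedule a packaged Spark job from Airflow. The job path and
# names are illustrative, not from the posting.
with DAG(
    dag_id="nightly_spark_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    run_job = SparkSubmitOperator(
        task_id="run_etl",
        application="/opt/jobs/etl_job.py",    # the packaged Spark workload
        conn_id="spark_default",
        conf={"spark.executor.memory": "4g"},  # tune to control compute cost
    )
```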
end tech specs and modular architectures for ML frameworks in complex problem spaces in collaboration with product teams Experience with large scale, distributed data processing frameworks/tools like Apache Beam, Apache Spark, and cloud platforms like GCP or AWS Experience with technologies such as Kubernetes and Ray is a plus Experience troubleshooting model training and deployment across …
platform components. Big Data Architecture: Build and maintain big data architectures and data pipelines to efficiently process large volumes of geospatial and sensor data. Leverage technologies such as Hadoop, Apache Spark, and Kafka to ensure scalability, fault tolerance, and speed. Geospatial Data Integration: Develop systems that integrate geospatial data from a variety of sources (e.g., satellite imagery, remote … driven applications. Familiarity with geospatial data formats (e.g., GeoJSON, Shapefiles, KML) and tools (e.g., PostGIS, GDAL, GeoServer). Technical Skills: Expertise in big data frameworks and technologies (e.g., Hadoop, Spark, Kafka, Flink) for processing large datasets. Proficiency in programming languages such as Python, Java, or Scala, with a focus on big data frameworks and APIs. Experience with cloud services … or related field. Experience with data visualization tools and libraries (e.g., Tableau, D3.js, Mapbox, Leaflet) for displaying geospatial insights and analytics. Familiarity with real-time stream processing frameworks (e.g., Apache Flink, Kafka Streams). Experience with geospatial data processing libraries (e.g., GDAL, Shapely, Fiona). Background in defense, national security, or environmental monitoring applications is a plus. Compensation and …
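A small Python sketch of the kind of geospatial integration step this listing mentions, using Shapely to filter sensor readings to an area of interest. The file names and feature properties are assumptions for illustration:

```python
import json

from shapely.geometry import shape

# Sketch: load GeoJSON features and keep only the sensor readings that
# fall inside an area of interest (AOI). Inputs are hypothetical.
with open("area_of_interest.geojson") as f:
    aoi = shape(json.load(f)["features"][0]["geometry"])

with open("sensor_readings.geojson") as f:
    readings = json.load(f)["features"]

inside = [r for r in readings if aoi.contains(shape(r["geometry"]))]
print(f"{len(inside)} of {len(readings)} readings fall inside the AOI")
```

For the data volumes described above, the same containment test would typically run inside a distributed engine (e.g., PostGIS or a Spark-based geospatial library) rather than in-process, but the logic is the same.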
two of the following: Python, SQL, Java Commercial experience in client-facing projects is a plus, especially within multi-disciplinary teams Deep knowledge of database technologies: Distributed systems (e.g., Spark, Hadoop, EMR); RDBMS (e.g., SQL Server, Oracle, PostgreSQL, MySQL); NoSQL (e.g., MongoDB, Cassandra, DynamoDB, Neo4j) Solid understanding of software engineering best practices - code reviews, testing frameworks, CI/CD …
West London, London, United Kingdom Hybrid / WFH Options
Young's Employment Services Ltd
a Senior Data Engineer, Tech Lead, Data Engineering Manager etc. Proven success with modern data infrastructure: distributed systems, batch and streaming pipelines Hands-on knowledge of tools such as Apache Spark, Kafka, Databricks, DBT or similar Experience building, defining, and owning data models, data lakes, and data warehouses Programming proficiency in Python, PySpark, Scala or Java. Experience operating …
Employment Type: Permanent
Salary: £80000 - £95000/annum Attractive Bonus and Benefits
science use-cases across various industries Design and develop feature engineering pipelines, build ML & AI infrastructure, deploy models, and orchestrate advanced analytical insights Write code in SQL, Python, and Spark following software engineering best practices Collaborate with stakeholders and customers to ensure successful project delivery Who we are looking for We are looking for collaborative individuals who want to …
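As one concrete example of a feature engineering pipeline of the sort this role describes, here is a minimal PySpark sketch computing rolling per-customer aggregates; the table and column names are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

# Sketch: rolling aggregates over each customer's last 30 transactions,
# a common feature engineering pattern. Names are illustrative.
spark = SparkSession.builder.appName("feature-sketch").getOrCreate()
tx = spark.table("transactions")

w = Window.partitionBy("customer_id").orderBy("tx_ts").rowsBetween(-29, 0)

features = tx.select(
    "customer_id",
    "tx_ts",
    F.avg("amount").over(w).alias("avg_amount_30_tx"),  # rolling mean
    F.count("*").over(w).alias("tx_count_30_tx"),       # rolling count
)
```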
model deployment. - Experience with Infrastructure as Code (IaC) using tools such as CDK. - Experience with streaming data processing and real-time analytics. - Experience with big data technologies (e.g., Hadoop, Spark, Hive). Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the …
cloud platforms (AWS, GCP, or Azure) • Experience with: data warehousing and lake architectures; ETL/ELT pipeline development; SQL and NoSQL databases; distributed computing frameworks (Spark, Kinesis, etc.); software development best practices including CI/CD, TDD and version control; containerisation tools like Docker or Kubernetes; Infrastructure as Code tools …
of Relational Databases and Data Warehousing concepts. Experience of Enterprise ETL tools such as Informatica, Talend, Datastage or Alteryx. Project experience using any of the following technologies: Hadoop, Spark, Scala, Oracle, Pega, Salesforce. Cross and multi-platform experience. Team building and leading. You must be: Willing to work on client sites, potentially for extended periods. Willing to travel …
Maths or similar Science or Engineering discipline Strong Python and other programming skills (Java and/or Scala desirable) Strong SQL background Some exposure to big data technologies (Hadoop, Spark, Presto, etc.) NICE TO HAVES OR EXCITED TO LEARN: Some experience designing, building and maintaining SQL databases (and/or NoSQL) Some experience with designing efficient physical data models …
Learning (ML): deploying and managing models, creating inference pipelines, and ML Ops practices. Knowledge of ML platforms such as Airflow, SageMaker, Kubeflow, or MLflow. Experience with distributed computing (e.g., Spark/PySpark). Understanding of cloud ML deployment and model serving on platforms like AWS, Azure, or GCP. Experience with Large Language Models (LLMs), model fine-tuning, and Retrieval …
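A minimal sketch of the model-tracking side of the ML Ops workflow mentioned above, using MLflow with scikit-learn; the experiment name and logged values are illustrative, and exact `log_model` arguments vary across MLflow versions:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Sketch: train a toy model, then log params, metrics, and the model
# artifact to MLflow for later deployment/serving. Names are illustrative.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

mlflow.set_experiment("churn-sketch")
with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, artifact_path="model")
```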
Programming Mastery: Advanced skills in Python or another major language; writing clean, testable, production-grade ETL code at scale. Modern Data Pipelines: Experience with batch and streaming frameworks (e.g., Apache Spark, Flink, Kafka Streams, Beam), including orchestration via Airflow, Prefect or Dagster. Data Modeling & Schema Management: Demonstrated expertise in designing, evolving, and documenting schemas (OLAP/OLTP, dimensional …
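To make the streaming half of that pipeline requirement concrete, a hedged Spark Structured Streaming sketch that consumes a Kafka topic and lands it in the lake. It assumes the spark-sql-kafka connector package is available; the broker address, topic, and paths are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

# Sketch: streaming counterpart to a batch ETL pipeline - read a Kafka
# topic and continuously write Parquet to the lake. Names are illustrative.
spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka values arrive as bytes; decode to string for downstream parsing.
decoded = stream.select(F.col("value").cast("string").alias("payload"))

query = (
    decoded.writeStream.format("parquet")
    .option("path", "s3://curated-zone/stream/events/")
    .option("checkpointLocation", "s3://curated-zone/checkpoints/events/")
    .start()
)
query.awaitTermination()
```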
or OpenCV. Knowledge of ML model serving infrastructure (TensorFlow Serving, TorchServe, MLflow). Knowledge of WebGL, Canvas API, or other graphics programming technologies. Familiarity with big data technologies (Kafka, Spark, Hadoop) and data engineering practices. Background in computer graphics, media processing, or VFX pipeline development. Experience with performance profiling, system monitoring, and observability tools. Understanding of network protocols, security …