About the job Scala Spark Developer - London Opening: Scala Spark Developer (Hybrid from London) Client Introduction: The company is a multinational Swedish SaaS product-based firm. Company Strength: 100+ Job Description: The Scala Spark Developer at HCL will be responsible for leading technical teams and projects related to Apache Spark, Scala, and Python. The role … involves overseeing the design, development, and implementation of scalable and efficient solutions using these technologies. Key Responsibilities 1. Lead technical teams in the design and implementation of solutions using Apache Spark, Scala, and Python 2. Provide technical expertise and guidance to team members in resolving complex technical issues 3. Collaborate with stakeholders to gather requirements and define project … 5. Conduct code reviews and performance optimization activities 6. Troubleshoot and debug technical issues to ensure seamless project delivery 7. Stay updated with the latest trends and advancements in Apache Spark, Scala, and Python technologies 8. Mentor team members and facilitate knowledge sharing within the team Skill Requirements 1. Strong proficiency in Apache Spark, Scala, and …
robust way possible! Diverse training opportunities and social benefits (e.g. UK pension scheme) What do you offer? Strong hands-on experience working with modern Big Data technologies such as Apache Spark, Trino, Apache Kafka, Apache Hadoop, Apache HBase, Apache NiFi, Apache Airflow, OpenSearch Proficiency in cloud-native technologies such as containerization and Kubernetes …
privacy, and security, ensuring our AI systems are developed and used responsibly and ethically. Tooling the Future: Get hands-on with cutting-edge technologies like Hugging Face, PyTorch, TensorFlow, Apache Spark, Apache Airflow, and other modern data and ML frameworks. Collaborate and Lead: Partner closely with ML Engineers, Data Scientists, and Researchers to understand their data needs … their data, compute, and storage services. Programming Prowess: Strong programming skills in Python and SQL are essential. Big Data Ecosystem Expertise: Hands-on experience with big data technologies like Apache Spark, Kafka, and data orchestration tools such as Apache Airflow or Prefect. ML Data Acumen: Solid understanding of data requirements for machine learning models, including feature engineering …
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Searchability
position, you'll develop and maintain a mix of real-time and batch ETL processes, ensuring accuracy, integrity, and scalability across vast datasets. You'll work with Python, SQL, Apache Spark, and AWS services such as EMR, Athena, and Lambda to deliver robust, high-performance solutions. You'll also play a key role in optimising data pipeline architecture, supporting … Proven experience as a Data Engineer, with Python & SQL expertise Familiarity with AWS services (or equivalent cloud platforms) Experience with large-scale datasets and ETL pipeline development Knowledge of Apache Spark (Scala or Python) beneficial Understanding of agile development practices, CI/CD, and automated testing Strong problem-solving and analytical skills Positive team player with excellent communication … required skills) your application to our client in conjunction with this vacancy only. KEY SKILLS: Data Engineer/Python/SQL/AWS/ETL/Data Pipelines/Apache Spark/EMR/Athena/Lambda/Big Data/Manchester/Hybrid Working
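Below is a minimal, illustrative sketch of the batch side of such a pipeline: it reads raw JSON events from S3 with PySpark, de-duplicates them, derives a partition column, and writes partitioned Parquet that Athena or EMR can query cheaply. The bucket paths, column names, and deduplication key are hypothetical, and the snippet assumes a Spark environment with S3 access already configured.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical bucket and path names, for illustration only.
RAW_PATH = "s3://example-raw-bucket/events/"
CURATED_PATH = "s3://example-curated-bucket/events_daily/"

spark = SparkSession.builder.appName("daily-events-etl").getOrCreate()

raw = spark.read.json(RAW_PATH)

cleaned = (
    raw
    .dropDuplicates(["event_id"])                             # assumed unique key
    .withColumn("event_date", F.to_date("event_timestamp"))   # derive partition column
    .filter(F.col("event_date").isNotNull())
)

# Partitioned Parquet keeps downstream Athena/EMR scans small and cheap.
cleaned.write.mode("overwrite").partitionBy("event_date").parquet(CURATED_PATH)
```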
Expertise in data warehousing, data modelling, and data integration. Experience in MLOps and machine learning pipelines. Proficiency in SQL and data manipulation languages. Experience with big data platforms (including Apache Arrow, Apache Spark, Apache Iceberg, and ClickHouse) and cloud-based infrastructure on AWS. Education & Qualifications Bachelor's or Master's degree in Computer Science, Engineering, or a related …
extract data from diverse sources, transform it into usable formats, and load it into data warehouses, data lakes or lakehouses. Big Data Technologies: Utilize big data technologies such as Spark, Kafka, and Flink for distributed data processing and analytics. Cloud Platforms: Deploy and manage data solutions on cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP), leveraging … SQL for data manipulation and scripting. Strong understanding of data modelling concepts and techniques, including relational and dimensional modelling. Experience in big data technologies and frameworks such as Databricks, Spark, Kafka, and Flink. Experience in using modern data architectures, such as lakehouse. Experience with CI/CD pipelines, version control systems like Git, and containerization (e.g., Docker). Experience … with ETL tools and technologies such as Apache Airflow, Informatica, or Talend. Strong understanding of data governance and best practices in data management. Experience with cloud platforms and services such as AWS, Azure, or GCP for deploying and managing data solutions. Strong problem-solving and analytical skills with the ability to diagnose and resolve complex data-related issues. SQL …
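To make the streaming side of this concrete, here is a minimal Spark Structured Streaming sketch that consumes a Kafka topic, parses JSON payloads against a declared schema, and maintains a running aggregate. The broker address, topic name, and schema are hypothetical; the job assumes the spark-sql-kafka connector package is on the classpath, and a real deployment would write to a lakehouse table rather than the console.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-orders-stream").getOrCreate()

# Hypothetical message schema for the "orders" topic.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("country", StringType()),
])

orders = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker address
    .option("subscribe", "orders")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("o"))
    .select("o.*")
)

# Running revenue per country, emitted each micro-batch (console sink for illustration).
query = (
    orders.groupBy("country").agg(F.sum("amount").alias("revenue"))
    .writeStream.outputMode("complete").format("console").start()
)
query.awaitTermination()
```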
extract data from diverse sources, transform it into usable formats, and load it into data warehouses, data lakes or lakehouses. Big Data Technologies: Utilize big data technologies such as Spark, Kafka, and Flink for distributed data processing and analytics. Cloud Platforms: Deploy and manage data solutions on cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP), leveraging … SQL for data manipulation and scripting. Strong understanding of data modelling concepts and techniques, including relational and dimensional modelling. Experience in big data technologies and frameworks such as Databricks, Spark, Kafka, and Flink. Experience in using modern data architectures, such as lakehouse. Experience with CI/CD pipelines and version control systems like Git. Knowledge of ETL tools and … technologies such as Apache Airflow, Informatica, or Talend. Knowledge of data governance and best practices in data management. Familiarity with cloud platforms and services such as AWS, Azure, or GCP for deploying and managing data solutions. Strong problem-solving and analytical skills with the ability to diagnose and resolve complex data-related issues. SQL (for database management and querying …
data-based insights, collaborating closely with stakeholders. Passionately discover hidden solutions in large datasets to enhance business outcomes. Design, develop, and maintain data processing pipelines using Cloudera technologies, including Apache Hadoop, Apache Spark, Apache Hive, and Python. Collaborate with data engineers and scientists to translate data requirements into technical specifications. Develop and maintain frameworks for efficient …
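As one concrete illustration of a pipeline on a Cloudera-style stack, the sketch below runs Spark with Hive support to aggregate a metastore-managed table and persist the result back to Hive. The database and table names are hypothetical, and the cluster is assumed to expose a Hive metastore to Spark.

```python
from pyspark.sql import SparkSession

# enableHiveSupport() lets Spark resolve Hive-managed tables through the metastore.
spark = (
    SparkSession.builder
    .appName("hive-usage-report")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical database/table: daily request counts per service from access logs.
daily_usage = spark.sql("""
    SELECT service, dt, COUNT(*) AS requests
    FROM ops.access_logs
    WHERE dt >= '2024-01-01'
    GROUP BY service, dt
""")

# Persist the aggregate back to Hive for downstream reporting tools.
daily_usage.write.mode("overwrite").saveAsTable("ops.daily_service_usage")
```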
in either Python or Scala Working knowledge of two or more common Cloud ecosystems (AWS, Azure, GCP) with expertise in at least one Deep experience with distributed computing with Apache Spark and knowledge of Spark runtime internals Familiarity with CI/CD for production deployments Working knowledge of MLOps Design and deployment of performant end-to-end … Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive benefits and perks that …
technologies Azure, AWS, GCP, Snowflake, Databricks Must Have Hands-on experience on at least 2 Hyperscalers (GCP/AWS/Azure platforms) and specifically in Big Data processing services (Apache Spark, Beam or equivalent). In-depth knowledge of key technologies like BigQuery/Redshift/Synapse/Pub Sub/Kinesis/MQ/Event Hubs … skills. A minimum of 5 years' experience in a similar role. Ability to lead and mentor the architects. Mandatory Skills [at least 2 Hyperscalers]: GCP, AWS, Azure, Big Data, Apache Spark, Beam on BigQuery/Redshift/Synapse, Pub Sub/Kinesis/MQ/Event Hubs, Kafka, Dataflow/Airflow/ADF, Designing Databricks-based solutions for …
services such as S3, Glue, Lambda, Redshift, EMR, Kinesis, and more, covering data pipelines, warehousing, and lakehouse architectures. Drive the migration of legacy data workflows to Lakehouse architectures, leveraging Apache Iceberg to enable unified analytics and scalable data management. Operate as a subject matter expert across multiple data projects, providing strategic guidance on best practices in design, development, and … in designing and implementing scalable data engineering solutions. Bring extensive experience in software architecture and solution design, ensuring robust and future-proof systems. Hold specialised proficiency in Python and Apache Spark, enabling efficient processing of large-scale data workloads. Demonstrate the ability to set technical direction, uphold high standards for code quality, and optimise performance in data-intensive … of continuous learning and innovation. Extensive background in software architecture and solution design, with deep expertise in microservices, distributed systems, and cloud-native architectures. Advanced proficiency in Python and Apache Spark, with a strong focus on ETL data processing and scalable data engineering workflows. In-depth technical knowledge of AWS data services, with hands-on experience implementing data …
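For the Iceberg migration work mentioned above, a minimal sketch might register an Iceberg catalog with Spark and load a legacy Parquet dataset into a managed Iceberg table. The catalog name, warehouse path, and table names are hypothetical, and the Iceberg Spark runtime jar is assumed to be on the classpath.

```python
from pyspark.sql import SparkSession

# Register a hypothetical Hadoop-type Iceberg catalog named "lake".
spark = (
    SparkSession.builder
    .appName("iceberg-migration")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3://example-lakehouse/warehouse/")
    .getOrCreate()
)

legacy = spark.read.parquet("s3://example-legacy-bucket/transactions/")

# DataFrameWriterV2: create (or replace) the Iceberg table from the legacy data.
legacy.writeTo("lake.finance.transactions").using("iceberg").createOrReplace()

# Later incremental loads can append rather than rewrite:
# new_batch.writeTo("lake.finance.transactions").append()
```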
and/or demonstrated competence in OLTP systems along with one of Azure, AWS or GCP cloud providers Demonstrated competence in the Lakehouse architecture including hands-on experience with Apache Spark, Python and SQL Excellent communication skills, both written and verbal Experience in pre-sales selling highly desired About Databricks Databricks is the data and AI company. More … Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive benefits and perks that …
and streaming data pipelines Azure Purview or equivalent for data governance and lineage tracking Experience with data integration, MDM, governance, and data quality tools. Hands-on experience with Apache Spark, Python, SQL, and Scala for data processing. Strong understanding of Azure networking, security, and IAM, including Azure Private Link, VNETs, Managed Identities, and RBAC. Deep knowledge … for scalable data lakes Azure Purview or equivalent for data governance and lineage tracking Experience with data integration, MDM, governance, and data quality tools. Hands-on experience with Apache Spark, Python, SQL, and Scala for data processing. Strong understanding of Azure networking, security, and IAM, including Azure Private Link, VNETs, Managed Identities, and RBAC. Deep knowledge …
focused data team responsible for building and optimising scalable, production-grade data pipelines and infrastructure. Key Responsibilities: Design and implement robust, scalable ETL/ELT pipelines using Databricks and Apache Spark Ingest, transform, and manage large volumes of data from diverse sources Collaborate with analysts, data scientists, and business stakeholders to deliver clean, accessible datasets Ensure high performance … practices Work with cloud-native tools and services (preferably Azure) Required Skills & Experience: Proven experience as a Data Engineer on cloud-based projects Strong hands-on skills with Databricks, Apache Spark, and Python or Scala Proficient in SQL and working with large-scale data environments Experience with Delta Lake, Azure Data Lake, or similar technologies Familiarity with version …
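For the Databricks/Delta Lake side of this role, the sketch below shows a small ELT step that cleans a raw CSV drop and writes it as a partitioned Delta table. Mount paths and column names are hypothetical; on Databricks a SparkSession is already provided, and elsewhere the snippet assumes the Delta Lake libraries are installed.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("delta-elt").getOrCreate()

# Hypothetical raw landing zone mounted into the workspace.
orders = (
    spark.read.option("header", True).csv("/mnt/raw/orders/")
    .withColumn("order_date", F.to_date("order_timestamp"))
    .dropDuplicates(["order_id"])
)

# Delta format adds ACID writes, schema enforcement, and time travel to the curated layer.
(
    orders.write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .save("/mnt/curated/orders/")
)
```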
/or teaching technical concepts to non-technical and technical audiences alike Passion for collaboration, life-long learning, and driving business value through ML [Preferred] Experience working with Databricks & Apache Spark to process large-scale distributed datasets About Databricks Databricks is the data and AI company. More than 10,000 organizations worldwide including Comcast, Condé Nast, Grammarly, and … Platform to unify and democratize data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive benefits and perks that …
London, South East, England, United Kingdom Hybrid / WFH Options
Randstad Technologies
scalable data pipelines, specifically using the Hadoop ecosystem and related tools. The role will focus on designing, building and maintaining scalable data pipelines using the big data Hadoop ecosystem and Apache Spark for large datasets. A key responsibility is to analyse infrastructure logs and operational data to derive insights, demonstrating a strong understanding of both data processing and the … underlying systems. The successful candidate should have the following key skills: Experience with Open Data Platform Hands-on experience with Python for scripting Apache Spark Prior experience of building ETL pipelines Data Modelling 6 Months Contract - Remote Working - £300 to £350 a day Inside IR35 If you are an experienced Hadoop engineer looking for a new role then …
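To ground the log-analysis responsibility, here is a minimal PySpark sketch that parses raw infrastructure logs with a regular expression and counts errors per component. The HDFS path and the assumed log layout ("timestamp level component message") are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("infra-log-analysis").getOrCreate()

# Hypothetical HDFS location; each line assumed to look like:
# "2024-05-01 12:00:01 ERROR kafka-broker Disk quota exceeded"
logs = spark.read.text("hdfs:///data/infra/syslog/2024/*")

pattern = r"^(\S+ \S+) (\w+) (\S+) (.*)$"
parsed = logs.select(
    F.regexp_extract("value", pattern, 1).alias("ts"),
    F.regexp_extract("value", pattern, 2).alias("level"),
    F.regexp_extract("value", pattern, 3).alias("component"),
)

# Error counts per component - the kind of operational insight the role calls for.
error_counts = (
    parsed.filter(F.col("level") == "ERROR")
    .groupBy("component")
    .count()
    .orderBy(F.col("count").desc())
)
error_counts.show(20, truncate=False)
```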
company covering the entire data transformation from architecture to implementation. Beyond delivering solutions, we also provide data & AI training and enablement. We are backed by Databricks - the creators of Apache Spark, and act as a delivery partner and training provider for them in Europe. Additionally, we are Microsoft Gold Partners in delivering cloud migration and data architecture on …
Role Title: Infrastructure/Platform Engineer - Apache Duration: 9 Months Location: Remote Rate: £ - Umbrella only Would you like to join a global leader in consulting, technology services and digital transformation? Our client is at the forefront of innovation to address the entire breadth of opportunities in the evolving world of cloud, digital and platforms. Role purpose/summary: • Refactor … prototype Spark jobs into production-quality components, ensuring scalability, test coverage, and integration readiness. • Package Spark workloads for deployment via Docker/Kubernetes and integrate with orchestration systems (e.g., Airflow, custom schedulers). • Work with platform engineers to embed Spark jobs into InfoSum's platform APIs and data pipelines. • Troubleshoot job failures, memory and resource issues, and … execution anomalies across various runtime environments. • Optimize Spark job performance and advise on best practices to reduce cloud compute and storage costs. • Guide engineering teams on choosing the right execution strategies across AWS, GCP, and Azure. • Provide subject matter expertise on using AWS Glue for ETL workloads and integration with S3 and other AWS-native services. • Implement observability tooling …
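For the orchestration integration, a minimal sketch of an Airflow DAG that submits a packaged Spark job is shown below. It assumes Airflow 2.x with the Apache Spark provider installed; the connection id, application path, and executor settings are hypothetical placeholders for whatever the platform's own deployment configuration defines.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

# Hypothetical nightly refresh DAG; conn_id "spark_k8s" would point at the
# cluster or Kubernetes endpoint registered in Airflow connections.
with DAG(
    dag_id="nightly_spark_refresh",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",
    catchup=False,
) as dag:
    run_job = SparkSubmitOperator(
        task_id="run_refresh_job",
        application="/opt/jobs/refresh_aggregates.py",   # packaged job inside the image
        conn_id="spark_k8s",
        conf={
            "spark.executor.instances": "4",
            "spark.executor.memory": "4g",   # tuned per workload to control compute cost
        },
    )
```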
Synechron are seeking a skilled Machine Learning Developer with expertise in Spark ML, predictive modeling, and deploying training and inference pipelines on distributed systems such as Hadoop. The ideal candidate will design, implement, and optimize machine learning solutions for large-scale … data processing and predictive analytics. Responsibilities Develop and implement machine learning models using Spark ML for predictive analytics. Design and optimize training and inference pipelines for distributed systems (e.g., Hadoop). Process and analyze large-scale datasets to extract meaningful insights and features. Collaborate with data engineers to ensure seamless integration of ML workflows with data pipelines. Evaluate model … time and batch inference. Monitor and troubleshoot deployed models to ensure reliability and performance. Stay updated with advancements in machine learning frameworks and distributed computing technologies. Requirements: Proficiency in Apache Spark and Spark MLlib for machine learning tasks. Strong understanding of predictive modeling techniques (e.g., regression, classification, clustering). Experience with distributed systems like Hadoop for data …
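A minimal Spark ML sketch of the kind of training and batch-inference pipeline described here: it assembles numeric features, fits a logistic regression inside a Pipeline, evaluates AUC on a held-out split, and persists the fitted model for later scoring. The source table, feature columns, label name, and model path are all hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = SparkSession.builder.appName("churn-model").getOrCreate()

# Hypothetical labelled training table with numeric feature columns.
df = spark.table("analytics.customer_features")
train, test = df.randomSplit([0.8, 0.2], seed=42)

assembler = VectorAssembler(
    inputCols=["tenure_months", "monthly_spend", "support_tickets"],
    outputCol="features",
)
lr = LogisticRegression(featuresCol="features", labelCol="label")

# Fit feature assembly and model as one unit so inference replays the same steps.
model = Pipeline(stages=[assembler, lr]).fit(train)

predictions = model.transform(test)
auc = BinaryClassificationEvaluator(labelCol="label").evaluate(predictions)
print(f"Test AUC: {auc:.3f}")

# Persist the fitted pipeline for scheduled batch or real-time scoring.
model.write().overwrite().save("hdfs:///models/churn_lr")
```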
Synechron is looking for a skilled Machine Learning Developer with expertise in Spark ML to work with a leading financial organisation on a global programme of work. The role involves predictive modeling and deploying training and inference pipelines on distributed systems such as Hadoop. The ideal candidate will design, implement, and optimise machine learning solutions for large-scale data … processing and predictive analytics. Role: Develop and implement machine learning models using Spark ML for predictive analytics Design and optimise training and inference pipelines for distributed systems (e.g., Hadoop) Process and analyse large-scale datasets to extract meaningful insights and features Collaborate with data engineers to ensure seamless integration of ML workflows with data pipelines Evaluate model performance and … time and batch inference Monitor and troubleshoot deployed models to ensure reliability and performance Stay updated with advancements in machine learning frameworks and distributed computing technologies Experience: Proficiency in Apache Spark and Spark MLlib for machine learning tasks Strong understanding of predictive modeling techniques (e.g., regression, classification, clustering) Experience with distributed systems like Hadoop for data storage …
Skills: Proven expertise in designing, building, and operating data pipelines, warehouses, and scalable data architectures. Deep hands-on experience with modern data stacks. Our tech includes Python, SQL, Snowflake, Apache Iceberg, AWS S3, PostgreSQL, Airflow, dbt, and Apache Spark, deployed via AWS, Docker, and Terraform. Experience with similar technologies is essential. Coaching & Growth Mindset: Passion for developing …
end tech specs and modular architectures for ML frameworks in complex problem spaces in collaboration with product teams Experience with large-scale, distributed data processing frameworks/tools like Apache Beam, Apache Spark, and cloud platforms like GCP or AWS Experience with technologies such as Kubernetes and Ray is a plus Experience troubleshooting model training and deployment across …