London (City of London), South East England, United Kingdom
Capgemini
We are seeking a Spark/Scala Developer to join our data engineering team. The ideal candidate will have hands-on experience in designing, developing, and maintaining large-scale data processing pipelines using Apache Spark and Scala. You will work closely with data scientists, analysts, and engineers to build efficient data solutions and enable data-driven decision-making. Key Responsibilities: Develop, optimize, and maintain data pipelines and ETL processes using Apache Spark and Scala. Design scalable and robust data processing solutions for batch and real-time data. Collaborate with cross-functional teams to gather requirements and translate them into technical specifications. Perform data ingestion, transformation, and cleansing from various structured and unstructured sources. Monitor and troubleshoot Spark jobs, ensuring high performance and …
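The core of this role is the batch ETL pattern the posting describes: ingest raw data, cleanse and transform it, and write it out for downstream use. A minimal sketch of that pattern in Spark/Scala follows; the object name, paths, and column names are illustrative placeholders, not details from the posting:

```scala
import org.apache.spark.sql.{SparkSession, functions => F}

object OrdersEtl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-etl")
      .getOrCreate()

    // Ingest raw JSON events (placeholder path)
    val raw = spark.read.json("s3://example-bucket/raw/orders/")

    // Cleanse and transform: drop rows missing a key, normalise types
    val cleaned = raw
      .filter(F.col("order_id").isNotNull)
      .withColumn("amount", F.col("amount").cast("decimal(12,2)"))
      .withColumn("order_date", F.to_date(F.col("created_at")))

    // Write partitioned Parquet for analysts and downstream jobs
    cleaned.write
      .mode("overwrite")
      .partitionBy("order_date")
      .parquet("s3://example-bucket/curated/orders/")

    spark.stop()
  }
}
```

The same read, transform, write shape extends naturally to the real-time case by swapping the batch reader for Spark Structured Streaming sources.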
London, South East, England, United Kingdom (Hybrid / WFH options)
Robert Half
Build and monitor machine learning models for anomaly detection and failure prediction. Analyze sensor data and operational logs to support predictive maintenance strategies. Develop and maintain data pipelines using tools like Apache Airflow for efficient workflows. Use MLflow for experiment tracking, model versioning, and deployment management. Contribute to data cleaning, feature engineering, and model evaluation processes. Collaborate with engineers and data … science libraries (Pandas, Scikit-learn, etc.). Solid understanding of machine learning concepts and algorithms. Interest in working with real-world industrial or sensor data. Exposure to Apache Airflow and/or MLflow (through coursework or experience) is a plus. A proactive, analytical mindset with a willingness to learn and collaborate. Why Join Us: Work on meaningful …
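The MLflow workflow this role mentions, logging an experiment's parameters and metrics against a run and then closing the run, can be driven from JVM code via the MLflow Java client (org.mlflow:mlflow-client). A minimal sketch, assuming a tracking server reachable through the MLFLOW_TRACKING_URI environment variable; the experiment name, parameter, and metric values are illustrative placeholders (Airflow DAGs themselves are defined in Python, so only the MLflow side is shown here):

```scala
import org.mlflow.tracking.MlflowClient
import org.mlflow.api.proto.Service.RunStatus

object TrackRun {
  def main(args: Array[String]): Unit = {
    // The no-arg client reads the tracking server address from MLFLOW_TRACKING_URI
    val client = new MlflowClient()

    // Create an experiment and start a run under it
    val experimentId = client.createExperiment("anomaly-detection-sketch")
    val runId = client.createRun(experimentId).getRunId

    // Record hyperparameters and evaluation metrics (illustrative values)
    client.logParam(runId, "contamination", "0.01")
    client.logMetric(runId, "precision", 0.92)

    // Mark the run finished so it shows as complete in the MLflow UI
    client.setTerminated(runId, RunStatus.FINISHED)
  }
}
```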
Provide guidance to cross-functional teams, ensuring best practices in data architecture, security and cloud computing. Proficiency in data modelling, ETL processes, data warehousing, distributed systems and metadata systems. Utilise Apache Flink and other streaming technologies to build real-time data processing systems that handle large-scale, high-throughput data. Ensure all data solutions comply with industry standards and government … but not limited to EC2, S3, RDS, Lambda and Redshift. Experience with other cloud providers (e.g., Azure, GCP) is a plus. In-depth knowledge and hands-on experience with Apache Flink for real-time data processing. Proven experience in mentoring and managing teams, with a focus on developing talent and fostering a collaborative work environment. Strong ability to engage …
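The real-time processing this role centres on maps onto Flink's keyed, windowed DataStream model: partition the stream by a key, window it, and aggregate. A minimal sketch using the Flink 1.x Scala API, with an in-memory source standing in for a real one such as Kafka or Kinesis; the event shape and window size are illustrative:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

object EventCounts {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Placeholder bounded source; a production job would read an unbounded
    // stream (e.g. Kafka), since this demo may exit before a one-minute
    // processing-time window ever fires
    val events: DataStream[(String, Int)] = env.fromElements(
      ("sensor-1", 1), ("sensor-2", 1), ("sensor-1", 1)
    )

    // Count events per key over one-minute tumbling windows
    val counts = events
      .keyBy(_._1)
      .window(TumblingProcessingTimeWindows.of(Time.minutes(1)))
      .sum(1)

    counts.print()
    env.execute("event-counts")
  }
}
```

Swapping the window assigner (tumbling, sliding, session) changes how the high-throughput stream is sliced without touching the keying or aggregation logic.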