and contribute to technical roadmap planning

Technical Skills:
- Strong SQL skills with experience in complex query optimisation
- Strong Python programming skills with experience in data processing libraries (pandas, NumPy, Apache Spark; see the sketch after this listing)
- Hands-on experience building and maintaining data ingestion pipelines
- Proven track record of optimising queries, code, and system performance
- Experience with open-source data processing frameworks (Apache Spark, Apache Kafka, Apache Airflow)
- Knowledge of distributed computing concepts and big data technologies
- Experience with version control systems (Git) and CI/CD practices
- Experience with relational databases (PostgreSQL, MySQL or similar)
- Experience with containerisation technologies (Docker, Kubernetes)
- Experience with data orchestration tools (Apache Airflow or Dagster)
- Understanding of data warehousing concepts and dimensional modelling …
London (City of London), South East England, United Kingdom
Vallum Associates
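Several of the requirements above (pandas, NumPy, ingestion pipelines) come together in a typical small pipeline. The following is a minimal, hypothetical sketch rather than any employer's actual code: the `id` and `amount` columns and the file paths are invented, and writing Parquet assumes pyarrow is installed.

```python
import pandas as pd
import numpy as np

def ingest(csv_path: str, out_path: str) -> pd.DataFrame:
    """Load a raw CSV, apply basic cleansing, persist as Parquet."""
    df = pd.read_csv(csv_path)
    # Basic cleansing: drop exact duplicates and rows missing the (hypothetical) key
    df = df.drop_duplicates().dropna(subset=["id"])
    # Vectorised transformation with NumPy, e.g. flooring negative amounts at zero
    df["amount"] = np.clip(df["amount"], 0, None)
    # Columnar output suits downstream Spark/SQL consumers (requires pyarrow)
    df.to_parquet(out_path, index=False)
    return df
```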
robust way possible! Diverse training opportunities and social benefits (e.g. UK pension scheme)

What do you offer?
- Strong hands-on experience working with modern Big Data technologies such as Apache Spark, Trino, Apache Kafka, Apache Hadoop, Apache HBase, Apache NiFi, Apache Airflow, OpenSearch
- Proficiency in cloud-native technologies such as containerization and Kubernetes …
Data Engineer - Azure Databricks, Apache Kafka
Permanent | Basingstoke (Hybrid - x2 PW) | Circa £70,000 + Excellent Package

Overview
We're looking for a skilled Data Analytics Engineer to help drive the evolution of our client's data platform. This role is ideal for someone who thrives on building scalable data solutions and is confident working with modern tools such as Azure Databricks, Apache Kafka, and Spark. In this role, you'll play a key part in designing, delivering, and optimising data pipelines and architectures. Your focus will be on enabling robust data ingestion and transformation to support both operational and analytical use cases. If you're passionate about data engineering and want to make a meaningful impact in a collaborative, fast-paced environment, we want to hear from you!

Role and Responsibilities
- Designing and building scalable data pipelines using Apache Spark in Azure Databricks
- Developing real-time and batch data ingestion workflows, ideally using Apache Kafka (see the streaming sketch after this listing)
- Collaborating with data scientists, analysts, and business stakeholders to build high-quality data products
- Supporting the deployment and …
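As a rough illustration of the real-time ingestion responsibility above, here is a minimal PySpark Structured Streaming sketch. It shows the shape of such a workflow, not the client's actual pipeline: the broker address, topic, and paths are placeholders, and it assumes an environment (such as Databricks) where the Kafka connector and Delta Lake are available.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

# Subscribe to a Kafka topic as a streaming DataFrame (hypothetical broker/topic)
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka delivers key/value as binary; cast the payload to string before parsing
parsed = events.select(col("value").cast("string").alias("raw"))

# Land the stream as Delta files, with checkpointing for restart recovery
query = (
    parsed.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/events")
    .start("/tmp/delta/events")
)
query.awaitTermination()
```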
experience in a leadership or technical lead role, with official line management responsibility. Strong experience with modern data stack technologies, including Python, Snowflake, AWS (S3, EC2, Terraform), Airflow, dbt, Apache Spark, Apache Iceberg, and Postgres. Skilled in balancing technical excellence with business priorities in a fast-paced environment. Strong communication and stakeholder management skills, able to translate …
us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Your Role
We are looking for a skilled Spark/Scala Developer to join our data engineering team. The ideal candidate will have hands-on experience in designing, developing, and maintaining large-scale data processing pipelines using Apache Spark and Scala. You will work closely with data scientists, analysts, and engineers to build efficient data solutions and enable data-driven decision-making.

Key Responsibilities:
- Develop, optimize, and maintain data pipelines and ETL processes using Apache Spark and Scala (a minimal sketch follows this listing)
- Design scalable and robust data processing solutions for batch and real-time data
- Collaborate with cross-functional teams to gather requirements and translate them into technical specifications
- Perform data ingestion, transformation, and cleansing from various structured and unstructured sources
- Monitor and troubleshoot Spark jobs, ensuring high performance and reliability
- Write clean, maintainable, and well-documented code
- Participate in code reviews, design discussions, and agile ceremonies
- Implement data quality and governance best practices
- Stay updated with …
London (City of London), South East England, United Kingdom
Capgemini
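The role above is Scala-focused; as a language-neutral sketch of the batch ETL pattern it describes, here is a PySpark equivalent. Table layout, column names, and paths are invented for illustration, not taken from any real project.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-etl").getOrCreate()

# Extract: read raw orders from a hypothetical landing zone
orders = spark.read.parquet("/data/landing/orders")

# Transform: dedupe, filter out bad rows, aggregate to daily totals
daily = (
    orders.dropDuplicates(["order_id"])
    .filter(F.col("amount") > 0)
    .groupBy(F.to_date("created_at").alias("order_date"))
    .agg(F.sum("amount").alias("total_amount"), F.count("*").alias("orders"))
)

# Load: write a partitioned, query-friendly output table
daily.write.mode("overwrite").partitionBy("order_date").parquet("/data/curated/daily_orders")
```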
contract assignment.

Key Requirements:
- Proven background in AI and data development
- Strong proficiency in Python, including data-focused libraries such as Pandas, NumPy, and PySpark
- Hands-on experience with Apache Spark (PySpark preferred)
- Solid understanding of data management and processing pipelines
- Experience in algorithm development and graph data structures is advantageous
- Active SC Clearance is mandatory

Role Overview
You will play a key role in developing and delivering advanced AI solutions for a Government client. Responsibilities include:
- Designing, building, and maintaining data processing pipelines using Apache Spark
- Implementing ETL/ELT workflows for large-scale data sets
- Developing and optimising Python-based data ingestion tools
- Collaborating on the design and deployment of machine learning models … performance across distributed systems
- Contributing to data architecture and storage strategy design
- Working with cloud data platforms (AWS, Azure, or GCP) to deploy scalable solutions
- Monitoring, troubleshooting, and tuning Spark jobs for performance and cost efficiency
- Engaging regularly with customers and internal stakeholders

This is an excellent opportunity to join a high-profile organisation on a long-term contract …
In order to be successful, you will have the following experience:
- Extensive AI and data development background
- Experience with Python (including data libraries such as Pandas, NumPy, and PySpark) and Apache Spark (PySpark preferred)
- Strong experience with data management and processing pipelines
- Algorithm development and knowledge of graphs will be beneficial
- SC Clearance is essential

Within this role, you will be responsible for:
- Supporting the development and delivery of AI solutions to a Government customer
- Designing, developing, and maintaining data processing pipelines using Apache Spark
- Implementing ETL/ELT workflows to extract, transform and load large-scale datasets efficiently
- Developing and optimizing Python-based applications for data ingestion
- Collaborating on development of machine learning models
- Ensuring data … to the design of data architectures, storage strategies, and processing frameworks
- Working with cloud data platforms (e.g. AWS, Azure, or GCP) to deploy scalable solutions
- Monitoring, troubleshooting, and optimizing Spark jobs for performance and cost efficiency (a tuning sketch follows this listing)
- Liaising with the customer and internal stakeholders on a regular basis

This represents an excellent opportunity to secure a long-term contract within a …
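Both Government-client listings above call out monitoring and tuning Spark jobs for performance and cost efficiency. The snippet below is a generic illustration of that kind of work under stated assumptions: the settings shown are common starting points rather than recommendations for any particular cluster, and the data path is hypothetical.

```python
from pyspark.sql import SparkSession

# Illustrative session-level tuning; real values depend on data volume and cluster size
spark = (
    SparkSession.builder.appName("tuned-job")
    .config("spark.sql.shuffle.partitions", "200")  # match shuffle width to data size
    .config("spark.sql.adaptive.enabled", "true")   # let AQE coalesce small partitions
    .getOrCreate()
)

df = spark.read.parquet("/data/curated/daily_orders")  # hypothetical path

# Inspect the physical plan before running an expensive job
df.groupBy("order_date").count().explain()

# Cache only what is reused, and release it when done to control memory cost
df.cache()
df.count()      # materialise the cache
df.unpersist()
```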
us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

YOUR ROLE
We are looking for a skilled Spark/Scala Developer to join our data engineering team. The ideal candidate will have hands-on experience in designing, developing, and maintaining large-scale data processing pipelines using Apache Spark and Scala. You will work closely with data scientists, analysts, and engineers to build efficient data solutions and enable data-driven decision-making.

YOUR PROFILE
- Develop, optimize, and maintain data pipelines and ETL processes using Apache Spark and Scala
- Design scalable and robust data processing solutions for batch and real-time data
- Collaborate with cross-functional teams to gather requirements and translate them into technical specifications
- Perform data ingestion, transformation, and cleansing from various structured and unstructured sources
- Monitor and troubleshoot Spark jobs, ensuring high performance and reliability
- Write clean, maintainable, and well-documented code
- Participate in code reviews, design discussions, and agile ceremonies
- Implement data quality and governance best practices
- Stay updated with …
optimizing scalable data solutions using the Databricks platform.

Key Responsibilities:
- Lead the migration of existing AWS-based data pipelines to Databricks (a migration sketch follows this listing)
- Design and implement scalable data engineering solutions using Apache Spark on Databricks
- Collaborate with cross-functional teams to understand data requirements and translate them into efficient pipelines
- Optimize performance and cost-efficiency of Databricks workloads
- Develop and … best practices for data governance, security, and access control within Databricks
- Provide technical mentorship and guidance to junior engineers

Must-Have Skills:
- Strong hands-on experience with Databricks and Apache Spark (preferably PySpark)
- Proven track record of building and optimizing data pipelines in cloud environments
- Experience with AWS services such as S3, Glue, Lambda, Step Functions, Athena …
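A common first step in the AWS-to-Databricks migration described above is landing existing S3 data as Delta tables. This is a minimal sketch, assuming a Databricks runtime where S3 access is already configured and the target schema exists; the bucket, prefix, and table names are placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-to-delta").getOrCreate()

# Read the legacy pipeline's output directly from S3 (hypothetical bucket/prefix)
raw = spark.read.parquet("s3://legacy-bucket/warehouse/orders/")

# Write as a managed Delta table so downstream jobs get ACID writes and time travel
(
    raw.write.format("delta")
    .mode("overwrite")
    .saveAsTable("analytics.orders")  # assumes the 'analytics' schema exists
)
```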
understanding of data modelling, warehousing, and performance optimisation.
- Proven experience with cloud platforms (AWS, Azure, or GCP) and their data services
- Hands-on experience with big data frameworks (e.g. Apache Spark, Hadoop)
- Strong knowledge of data governance, security, and compliance
- Ability to lead technical projects and mentor junior engineers
- Excellent problem-solving skills and experience in agile …
modelling tools, data warehousing, ETL processes, and data integration techniques.
- Experience with at least one cloud data platform (e.g. AWS, Azure, Google Cloud) and big data technologies (e.g. Hadoop, Spark)
- Strong knowledge of data workflow solutions like Azure Data Factory, Apache NiFi, Apache Airflow, etc. (a minimal Airflow sketch follows this listing)
- Good knowledge of stream and batch processing solutions like Apache Flink and Apache Kafka
- Good knowledge of log management, monitoring, and analytics solutions like Splunk, Elastic Stack, New Relic, etc.

Given that this is just a short snapshot of the role, we encourage you to apply even if you don't meet all the requirements listed above. We are looking for individuals who strive to make an impact and …
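Several listings in this set name Apache Airflow as the workflow layer. For orientation, a minimal DAG looks roughly like this (Airflow 2.4+ syntax assumed; the DAG id, schedule, and task bodies are invented placeholders):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull from source")    # placeholder for a real extract step

def load():
    print("write to warehouse")  # placeholder for a real load step

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # use 'schedule_interval' on Airflow versions before 2.4
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_load  # load runs only after extract succeeds
```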
pipelines and ETL processes.
- Proficiency in Python
- Experience with cloud platforms (AWS, Azure, or GCP)
- Knowledge of data modelling, warehousing, and optimisation
- Familiarity with big data frameworks (e.g. Apache Spark, Hadoop)
- Understanding of data governance, security, and compliance best practices
- Strong problem-solving skills and experience working in agile environments

Desirable: Experience with Docker/Kubernetes …
two of the following: Python, SQL, Java
- Commercial experience in client-facing projects is a plus, especially within multi-disciplinary teams
- Deep knowledge of database technologies:
  - Distributed systems (e.g. Spark, Hadoop, EMR)
  - RDBMS (e.g. SQL Server, Oracle, PostgreSQL, MySQL)
  - NoSQL (e.g. MongoDB, Cassandra, DynamoDB, Neo4j)
- Solid understanding of software engineering best practices: code reviews, testing frameworks, CI/CD …