performance and responsiveness. Stay Up to Date with Technology: Keep yourself and the team updated on the latest Python technologies, frameworks, and tools like ApacheSpark , Databricks , Apache Pulsar , Apache Airflow , Temporal , and Apache Flink , sharing knowledge and suggesting improvements. Documentation: Contribute to clear and … or Azure . DevOps Tools: Familiarity with containerization ( Docker ) and infrastructure automation tools like Terraform or Ansible . Real-time Data Streaming: Experience with Apache Pulsar or similar systems for real-time messaging and stream processing is a plus. Data Engineering: Experience with ApacheSpark , Databricks , or … similar big data platforms for processing large datasets, building data pipelines, and machine learning workflows. Workflow Orchestration: Familiarity with tools like Apache Airflow or Temporal for managing workflows and scheduling jobs in distributed systems. Stream Processing: Experience with Apache Flink or other stream processing frameworks is a plus. More ❯
will be working on complex data problems in a challenging and fun environment, using some of the latest Big Data open-source technologies like ApacheSpark, as well as Amazon Web Service technologies including Elastic MapReduce, Athena and Lambda to develop scalable data solutions. Key Responsibilities: Adhering to … positive attitude, willing to help other members of the team. Experience debugging and dealing with failures on business-critical systems. Preferred Qualifications: Exposure to ApacheSpark, Apache Trino, or another big data processing system. Knowledge of streaming data principles and best practices. Understanding of database technologies and More ❯
London, England, United Kingdom Hybrid / WFH Options
BI:PROCSI
automation to ensure successful project delivery, adhering to client timelines and quality standards. Implement and manage real-time and batch data processing frameworks (e.g., Apache Kafka, ApacheSpark, Google Cloud Dataproc) in line with project needs. Build and maintain robust monitoring, logging, and alerting systems for client … in languages like Python, Bash, or Go to automate tasks and build necessary tools. Expertise in designing and optimising data pipelines using frameworks like Apache Airflow or equivalent. Demonstrated experience with real-time and batch data processing frameworks, including Apache Kafka, ApacheSpark, or Google Cloud More ❯
Responsibilities: Design and develop scalable, high-performance data processing applications using Scala. Build and optimize ETL pipelines for handling large-scale datasets. Work with ApacheSpark, Kafka, Flink, and other distributed data frameworks to process massive amounts of data. Develop and maintain data lake and warehouse solutions using … technologies like Databricks Delta Lake, Apache Iceberg, or Apache Hudi. Write clean, maintainable, and well-documented code. Optimize query performance, indexing strategies, and storage formats (JSON, Parquet, Avro, ORC). Implement real-time streaming solutions and event-driven architectures. Collaborate with data scientists, analysts, and DevOps engineers to … Master’s degree in Computer Science Engineering, or a related field. 10+ years of experience in software development with Scala Hands-on experience with ApacheSpark (batch & streaming) using Scala. Experience in developing and maintaining data lake and warehouses using technologies like Databricks Delta Lake, Apache Iceberg More ❯
transform it into usable formats, and load it into data warehouses, data lakes or lakehouses. Big Data Technologies: Utilize big data technologies such as Spark, Kafka, and Flink for distributed data processing and analytics. Cloud Platforms: Deploy and manage data solutions on cloud platforms such as AWS, Azure, or … scripting. Strong understanding of data modelling concepts and techniques, including relational and dimensional modelling. Experience in big data technologies and frameworks such as Databricks, Spark, Kafka, and Flink. Experience in using modern data architectures, such as lakehouse. Experience with CI/CD pipelines, version control systems like Git, and … containerization (e.g., Docker). Experience with ETL tools and technologies such as Apache Airflow, Informatica, or Talend. Strong understanding of data governance and best practices in data management. Experience with cloud platforms and services such as AWS, Azure, or GCP for deploying and managing data solutions. Strong problem-solving More ❯
transform it into usable formats, and load it into data warehouses, data lakes or lakehouses. Big Data Technologies: Utilize big data technologies such as Spark, Kafka, and Flink for distributed data processing and analytics. Cloud Platforms: Deploy and manage data solutions on cloud platforms such as AWS, Azure, or … scripting. Strong understanding of data modelling concepts and techniques, including relational and dimensional modelling. Experience in big data technologies and frameworks such as Databricks, Spark, Kafka, and Flink. Experience in using modern data architectures, such as lakehouse. Experience with CI/CD pipelines and version control systems like Git. … Knowledge of ETL tools and technologies such as Apache Airflow, Informatica, or Talend. Knowledge of data governance and best practices in data management. Familiarity with cloud platforms and services such as AWS, Azure, or GCP for deploying and managing data solutions. Strong problem-solving and analytical skills with the More ❯
learning libraries in one or more programming languages. Keen interest in some of the following areas: Big Data Analytics (e.g. Google BigQuery/BigTable, ApacheSpark), Parallel Computing (e.g. ApacheSpark, Kubernetes, Databricks), Cloud Engineering (AWS, GCP, Azure), Spatial Query Optimisation, Data Storytelling with (Jupyter) Notebooks More ❯
Government client on a contract basis. This role requires a deep understanding of data engineering best practices, strong hands-on experience with AWS, Azure, ApacheSpark, data warehousing, database modelling and SQL. You’ll play a critical role in designing, building, and maintaining our data infrastructure to support … high-performance data pipelines and analytics platforms. Responsibilities will include: Design, build, and maintain robust, scalable, and secure data pipelines using AWS services and Apache Spark. Develop and optimize data models for reporting and analytics in Redshift and other DWH platforms. Collaborate with Data Scientists, Analysts, and Business Stakeholders … database technologies including Oracle, Postgres and MSSQLServer; Strong expertise in AWS services including AWS DMS, S3, Lambda, Glue, EMR, Redshift, and IAM. Proficient in ApacheSpark (batch and/or streaming) and big data processing. Solid experience with SQL and performance tuning in data warehouse environments. Hands-on More ❯
Manager - Life Sciences at RED Global UK based On-site: 2+ days a week Key Responsibilities: Orchestrate data workflows: Utilize workflow management tools like Apache Airflow (or similar) to schedule, monitor, and manage the execution of data pipelines. Manage Cloud Infrastructure: Leverage cloud platforms like AWS and Azure to … Python programming skills: Experience writing and debugging complex Python code, including experience with libraries like Pandas, PySpark, and related data science libraries. Experience with ApacheSpark and Databricks: Deep understanding of ApacheSpark principles and experience with Databricks notebooks, clusters, and workspace management. Orchestration tools expertise … Strong experience with workflow management tools like Apache Airflow, Prefect, or similar. This includes designing and implementing complex DAGs (Directed Acyclic Graphs) for pipeline orchestration. Cloud platform experience: Hands-on experience with AWS or Azure , including services related to data processing, storage, and compute. Infrastructure as Code (IaC): Familiarity More ❯
Working knowledge of two or more common Cloud ecosystems (AWS, Azure, GCP) with expertise in at least one. Deep experience with distributed computing with ApacheSpark and knowledge of Spark runtime internals. Familiarity with CI/CD for production deployments. Working knowledge of MLOps. Design and deployment … data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, ApacheSpark, Delta Lake, and MLflow. Benefits At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of More ❯
Working knowledge of two or more common Cloud ecosystems (AWS, Azure, GCP) with expertise in at least one. Deep experience with distributed computing with ApacheSpark and knowledge of Spark runtime internals. Familiarity with CI/CD for production deployments. Working knowledge of MLOps. Design and deployment … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, ApacheSpark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive More ❯
Working knowledge of two or more common Cloud ecosystems (AWS, Azure, GCP) with expertise in at least one Deep experience with distributed computing with ApacheSpark and knowledge of Spark runtime internals Familiarity with CI/CD for production deployments Working knowledge of MLOps Design and deployment … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, ApacheSpark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter ,LinkedIn and Facebook . Benefits At Databricks, we strive to provide More ❯
Working knowledge of two or more common Cloud ecosystems (AWS, Azure, GCP) with expertise in at least one Deep experience with distributed computing with ApacheSpark and knowledge of Spark runtime internals Familiarity with CI/CD for production deployments Working knowledge of MLOps Design and deployment … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, ApacheSpark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide comprehensive More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯
data pipelines to serve the easyJet analyst and data science community. Highly competent hands-on experience with relevant Data Engineering technologies, such as Databricks, Spark, Spark API, Python, SQL Server, Scala. Work with data scientists, machine learning engineers and DevOps engineers to develop, develop and deploy machine learning … development experience with Terraform or CloudFormation. Understanding of ML development workflow and knowledge of when and how to use dedicated hardware. Significant experience with ApacheSpark or any other distributed data programming frameworks (e.g. Flink, Hadoop, Beam) Familiarity with Databricks as a data and AI platform or the … data privacy, handling of sensitive data (e.g. GDPR) Experience in event-driven architecture, ingesting data in real time in a commercial production environment with Spark Streaming, Kafka, DLT or Beam. Understanding of the challenges faced in the design and development of a streaming data pipeline and the different options More ❯