two of the following: Python, SQL, Java. Commercial experience in client-facing projects is a plus, especially within multi-disciplinary teams. Deep knowledge of database technologies: distributed systems (e.g., Spark, Hadoop, EMR); RDBMS (e.g., SQL Server, Oracle, PostgreSQL, MySQL); NoSQL (e.g., MongoDB, Cassandra, DynamoDB, Neo4j). Solid understanding of software engineering best practices - code reviews, testing frameworks, CI/CD…
and managing machine learning models and infrastructure. Data Management Knowledge: Understanding of data management principles, including experience with databases (SQL and NoSQL) and familiarity with big data frameworks like Apache Spark or Hadoop. Knowledge of data ingestion, storage, and management is essential. Monitoring and Logging Tools: Experience with monitoring and logging tools to track system performance and model…
Cleared: Required. Essential Skills & Experience: 10+ years of experience in data engineering, with at least 3+ years of hands-on experience with Azure Databricks. Strong proficiency in Python and Spark (PySpark) or Scala. Deep understanding of data warehousing principles, data modelling techniques, and data integration patterns. Extensive experience with Azure data services, including Azure Data Factory, Azure Blob Storage…
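For illustration, a minimal PySpark sketch of the kind of Azure Databricks work referenced above (the storage paths, container names, and columns are hypothetical, and storage credentials are assumed to be configured on the cluster):

```python
# Hypothetical sketch: read raw CSV data from Azure storage, apply a simple
# transformation, and write a curated, partitioned Parquet output.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("curate-orders").getOrCreate()

# Hypothetical ADLS Gen2 landing path for raw data
raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/orders/"

orders = (
    spark.read.option("header", "true").csv(raw_path)
    .withColumn("order_date", F.to_date("order_date"))
    .withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("amount") > 0)
)

daily_totals = orders.groupBy("order_date").agg(F.sum("amount").alias("total_amount"))

# Write the curated layer partitioned by date for efficient date-bounded queries
daily_totals.write.mode("overwrite").partitionBy("order_date").parquet(
    "abfss://curated@examplestorage.dfs.core.windows.net/daily_order_totals/"
)
```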
science use-cases across various industries. Design and develop feature engineering pipelines, build ML & AI infrastructure, deploy models, and orchestrate advanced analytical insights. Write code in SQL, Python, and Spark following software engineering best practices. Collaborate with stakeholders and customers to ensure successful project delivery. Who we are looking for: We are looking for collaborative individuals who want to…
cloud platforms (AWS, GCP, or Azure). Experience with: data warehousing and lake architectures; ETL/ELT pipeline development; SQL and NoSQL databases; distributed computing frameworks (Spark, Kinesis, etc.); software development best practices including CI/CD, TDD and version control; containerisation tools like Docker or Kubernetes; Infrastructure as Code tools…
SageMaker, GCP AI Platform, Azure ML, or equivalent). Solid understanding of data-engineering concepts: SQL/NoSQL, data pipelines (Airflow, Prefect, or similar), and batch/streaming frameworks (Spark, Kafka). Leadership & Communication: Proven ability to lead cross-functional teams in ambiguous startup settings. Exceptional written and verbal communication skills, able to explain complex concepts to both technical and…
The Data Analytics Consultant is an emerging leader with previous Data Analytics experience. The ideal candidate will have experience in design, development and operations using services like Amazon Kinesis, Apache Kafka, Apache Spark, Amazon SageMaker, and Amazon EMR. The Data Analytics Consultant is comfortable rolling up their sleeves to design and code modules for infrastructure, application, and processes.
Guildford, Surrey, United Kingdom Hybrid / WFH Options
Stott and May
Start: ASAP Duration: 12 months Location: Mostly Remote - must have access to London or Bristol Pay: negotiable, INSIDE IR35 Responsibilities: - Design and implement robust ETL/ELT data pipelines using Apache Airflow - Build ingestion processes from internal systems and APIs using Kafka, Spark, and AWS - Develop and maintain data lakes and warehouses (AWS S3, Redshift) - Ensure governance using automated testing … manage CI/CD pipelines for data deployments and ensure version control of DAGs - Apply best practice in security and compliance Required Tech Skills: - Python and SQL for processing - Apache Airflow, writing Airflow DAGs and configuring Airflow jobs - AWS cloud platform and services like S3, Redshift - Familiarity with big data processing using Apache Spark - Knowledge of modelling…
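As a hedged illustration of the Airflow DAG work described above, a minimal extract-transform-load sketch (the DAG id, schedule, and task callables are hypothetical placeholders):

```python
# Minimal Airflow 2.x DAG sketch; on versions before 2.4 use schedule_interval instead of schedule.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_from_api(**context):
    # Placeholder: pull records from an internal API and stage them in S3
    pass


def transform_with_spark(**context):
    # Placeholder: run a Spark job that cleans and enriches the staged data
    pass


def load_to_redshift(**context):
    # Placeholder: COPY the transformed data into Redshift
    pass


with DAG(
    dag_id="example_etl_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_from_api)
    transform = PythonOperator(task_id="transform", python_callable=transform_with_spark)
    load = PythonOperator(task_id="load", python_callable=load_to_redshift)

    # Linear dependency chain: extract, then transform, then load
    extract >> transform >> load
```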
of Relational Databases and Data Warehousing concepts. Experience of Enterprise ETL tools such as Informatica, Talend, DataStage or Alteryx. Project experience using any of the following technologies: Hadoop, Spark, Scala, Oracle, Pega, Salesforce. Cross- and multi-platform experience. Team building and leading. You must be: Willing to work on client sites, potentially for extended periods. Willing to travel…
Maths or similar Science or Engineering discipline. Strong Python and other programming skills (Java and/or Scala desirable). Strong SQL background. Some exposure to big data technologies (Hadoop, Spark, Presto, etc.). NICE TO HAVES OR EXCITED TO LEARN: Some experience designing, building and maintaining SQL databases (and/or NoSQL). Some experience with designing efficient physical data models…
in data engineering, architecture, or platform management roles, with 5+ years in leadership positions. Expertise in modern data platforms (e.g., Azure, AWS, Google Cloud) and big data technologies (e.g., Spark, Kafka, Hadoop). Strong knowledge of data governance frameworks, regulatory compliance (e.g., GDPR, CCPA), and data security best practices. Proven experience in enterprise-level architecture design and implementation. Hands…
able to work across the full data cycle. • Proven experience working with AWS data technologies (S3, Redshift, Glue, Lambda, Lake Formation, CloudFormation), GitHub, CI/CD • Coding experience in Apache Spark, Iceberg or Python (Pandas) • Experience in change and release management. • Experience in data warehouse design and data modelling • Experience managing data migration projects. • Cloud data platform development … the AWS services like Redshift, Lambda, S3, Step Functions, Batch, CloudFormation, Lake Formation, CodeBuild, CI/CD, GitHub, IAM, SQS, SNS, Aurora DB • Good experience with DBT, Apache Iceberg, Docker, Microsoft BI stack (nice to have) • Experience in data warehouse design (Kimball and lake house, medallion and data vault) is a definite preference, as is knowledge of … other data tools and programming languages such as Python & Spark, and strong SQL experience. • Experience in building data lakes and building CI/CD data pipelines • Candidates are expected to understand and be able to demonstrate experience across the delivery lifecycle and understand both Agile and Waterfall methods and when to apply them. Experience: This position requires several years of…
as AWS, Azure, GCP, and Snowflake. Understanding of cloud platform infrastructure and its impact on data architecture. Data Technology Skills: A solid understanding of big data technologies such as Apache Spark, and knowledge of Hadoop ecosystems. Knowledge of programming languages such as Python, R, or Java is beneficial. Exposure to ETL/ELT processes, SQL, and NoSQL databases is…
Programming Mastery: Advanced skills in Python or another major language; writing clean, testable, production-grade ETL code at scale. Modern Data Pipelines: Experience with batch and streaming frameworks (e.g., Apache Spark, Flink, Kafka Streams, Beam), including orchestration via Airflow, Prefect or Dagster. Data Modeling & Schema Management: Demonstrated expertise in designing, evolving, and documenting schemas (OLAP/OLTP, dimensional…
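For reference, a brief Spark Structured Streaming sketch of the streaming pattern mentioned above, reading from Kafka and writing to object storage (the broker address, topic, schema, and sink paths are hypothetical, and the spark-sql-kafka connector package is assumed to be available):

```python
# Hypothetical streaming sketch: parse JSON events from a Kafka topic and land them as Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("user_id", StringType()),
    StructField("value", DoubleType()),
])

raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # hypothetical broker
    .option("subscribe", "events")                      # hypothetical topic
    .load()
)

# Kafka delivers the payload as bytes; cast to string and parse with the declared schema
events = (
    raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://example-bucket/events/")                 # hypothetical sink
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
    .start()
)
query.awaitTermination()
```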
management and associated tools such as Git/Bitbucket. Experience in the use of CI/CD tools such as Jenkins, or an understanding of their role. Experience with Apache Spark or Hadoop. Experience in building data pipelines. Experience of designing warehouses, ETL pipelines and data modelling. Good knowledge of designing, building, using, and maintaining REST APIs. Good…
or OpenCV. Knowledge of ML model serving infrastructure (TensorFlow Serving, TorchServe, MLflow). Knowledge of WebGL, Canvas API, or other graphics programming technologies. Familiarity with big data technologies (Kafka, Spark, Hadoop) and data engineering practices. Background in computer graphics, media processing, or VFX pipeline development. Experience with performance profiling, system monitoring, and observability tools. Understanding of network protocols, security…
architecture principles, including data modeling, data warehousing, data integration, and data governance. Databricks Expertise: They have hands-on experience with the Databricks platform, including its various components such as Spark, Delta Lake, MLflow, and Databricks SQL. They are proficient in using Databricks for various data engineering and data science tasks. Cloud Platform Proficiency: They are familiar with cloud platforms…
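As an illustrative, hypothetical sketch of the MLflow experiment tracking referenced above: log parameters, a metric, and a fitted scikit-learn model for a simple baseline run.

```python
# Hypothetical MLflow tracking sketch; run names, parameters, and the toy dataset are illustrative only.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="rf-baseline"):
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, model.predict(X_test))
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, "model")  # stores the fitted model as a run artifact
```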
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
ADLIB Recruitment
systems. Clear communicator, able to translate complex data concepts to cross-functional teams. Bonus points for experience with: DevOps tools like Docker, Kubernetes, CI/CD; big data tools (Spark, Hadoop), ETL workflows, or high-throughput data streams; genomic data formats and tools; cold and hot storage management, ZFS/RAID systems, or tape storage; AI/LLM tools…
Scala). Extensive experience with cloud platforms (AWS, GCP, or Azure). Experience with: data warehousing and lake architectures; ETL/ELT pipeline development; SQL and NoSQL databases; distributed computing frameworks (Spark, Kinesis, etc.); software development best practices including CI/CD, TDD and version control. Strong understanding of data modelling and system architecture. Excellent problem-solving and analytical skills. Whilst…
and maintenance of IDBS's software platforms adheres to IDBS's architecture vision. What we'll get you doing: Design, develop, and maintain scalable data pipelines using Databricks and Apache Spark (PySpark) to support analytics and other data-driven initiatives. Support the elaboration of requirements, formulation of the technical implementation plan and backlog refinement. Provide technical perspective to … product enhancements & new requirements activities. Optimize Spark-based workflows for performance, scalability, and data integrity, ensuring alignment with GxP and other regulatory standards. Research and promote new technologies, design patterns, approaches, tools and methodologies that could optimise and accelerate development. Apply strong software engineering practices including version control (Git), CI/CD pipelines, unit testing, and code reviews to … for research and regulatory teams. Enabled regulatory compliance by implementing secure, auditable, and GxP-aligned data workflows with robust access controls. Improved system performance and cost-efficiency by optimizing Spark jobs and Databricks clusters, leading to measurable reductions in compute costs and processing times. Fostered cross-functional collaboration by building reusable, testable, well-documented Databricks notebooks and APIs that…
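A minimal sketch, with hypothetical table and column names, of the Databricks/PySpark pattern described above: enforce a schema on ingest, deduplicate, and write to a partitioned Delta table (Delta support is assumed via the Databricks runtime or the delta-spark package):

```python
# Hypothetical pipeline sketch: cast raw fields to typed columns, drop duplicate records,
# and persist the curated layer as a date-partitioned Delta table.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("assay-results-pipeline").getOrCreate()

raw = spark.read.format("json").load("/mnt/raw/assay_results/")  # hypothetical mount point

curated = (
    raw.select(
        F.col("sample_id").cast("string"),
        F.col("assay_name").cast("string"),
        F.col("result_value").cast("double"),
        F.to_date("run_date").alias("run_date"),
    )
    .dropDuplicates(["sample_id", "assay_name", "run_date"])
)

# Partitioning by run_date keeps files a manageable size and speeds up date-bounded queries
(
    curated.write.format("delta")
    .mode("overwrite")
    .partitionBy("run_date")
    .save("/mnt/curated/assay_results/")
)
```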
concepts to non-technical stakeholders. Team Player: Ability to work effectively in a collaborative team environment, as well as independently. Preferred Qualifications: Familiarity with big data technologies (e.g., Hadoop, Spark, Kafka). Familiarity with AWS and its data services (e.g., S3, Athena, AWS Glue). Familiarity with data warehousing solutions (e.g., Redshift, BigQuery, Snowflake). Knowledge of containerization and … orchestration tools (e.g., Docker, ECS, Kubernetes). Familiarity with data orchestration tools (e.g., Prefect, Apache Airflow). Familiarity with CI/CD pipelines and DevOps practices. Familiarity with Infrastructure-as-Code tools (e.g., Terraform, AWS CDK). Employee Benefits: At Intelmatix, our benefits package is designed to meet the diverse needs of our employees, reflecting our dedication to their…
solutions. The team's expertise spans a wide range of technologies, including Java and Python-based microservices, AWS/GCP cloud backend systems, Big Data technologies like Hive and Spark, and modern Web applications. With a globally distributed presence across the US, India and Europe, the team thrives on collaboration, bringing together diverse perspectives to solve complex challenges. At … entrepreneurial spirit. Excellent verbal and written communication skills. BS or MS degree in Computer Science or equivalent. Nice to Have: Experience in distributed computing frameworks like Hive/Hadoop and Apache Spark. Experience in developing Finance or HR related applications. Experience with the following cloud services: AWS Elastic Beanstalk, EC2, S3, CloudFront, RDS, DynamoDB, VPC, ElastiCache, Lambda. Working experience … with Terraform. Experience in creating workflows for Apache Airflow. Benefits: Roku is committed to offering a diverse range of benefits as part of our compensation package to support our employees and their families. Our comprehensive benefits include global access to mental health and financial wellness support and resources. Local benefits include statutory and voluntary benefits which may include healthcare…
in Computer Science, Data Science, Engineering, or a related field. Strong programming skills in languages such as Python, SQL, or Java. Familiarity with data processing frameworks and tools (e.g., Apache Spark, Hadoop, Kafka) is a plus. Basic understanding of cloud platforms (e.g., AWS, Azure, Google Cloud) and their data services. Knowledge of database systems (e.g., MySQL, PostgreSQL, MongoDB…