technologies are noteworthy to mention and might be seen as a bonus: AWS (and Data-related proprietary technologies), Azure (and Data-related proprietary technologies), Adverity, Fivetran, Looker, PBI, Tableau, RDBMS, Spark, Redis, Kafka; Being a fair, kind and reliable colleague, open to giving and receiving criticism and feedback, autonomous but equally responsible; Working proficiency in the English language; Master's …
and well-tested solutions to automate data ingestion, transformation, and orchestration across systems. Own data operations infrastructure: Manage and optimise key data infrastructure components within AWS, including Amazon Redshift, Apache Airflow for workflow orchestration, and other analytical tools. You will be responsible for ensuring the performance, reliability, and scalability of these systems to meet the growing demands of data … pipelines, data warehouses, and leveraging AWS data services. Strong proficiency in DataOps methodologies and tools, including experience with CI/CD pipelines, containerized applications, and workflow orchestration using Apache Airflow. Familiarity with ETL frameworks, and bonus experience with big data processing (Spark, Hive, Trino) and data streaming. Proven track record - You've made a demonstrable impact …
Ability to manage complex systems and troubleshoot production issues effectively. Experience working in an agile, cross-functional team environment. Nice to Have: Experience with big data tools such as Apache Spark, Kafka, or other data processing frameworks or platforms like Databricks, Snowflake. Knowledge of data governance, data security practices, and best practices for managing large data sets that …
of data within the organisation while working with advanced cloud technologies. Key Responsibilities and Deliverables: Design, develop, and optimise end-to-end data pipelines (batch & streaming) using Azure Databricks, Spark, and Delta Lake. Implement Medallion Architecture to structure raw, enriched, and curated data layers efficiently. Build scalable ETL/ELT processes with Azure Data Factory and PySpark. Support data … data pipelines. Collaborate with analysts to validate and refine datasets for reporting. Apply DevOps and CI/CD best practices (Git, Azure DevOps) for automated testing and deployment. Optimise Spark jobs, Delta Lake tables, and SQL queries for performance and cost-effectiveness. Troubleshoot and proactively resolve data pipeline issues. Partner with data architects, analysts, and business teams to deliver …
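As an aside on the Medallion Architecture referenced in the listing above, the following is a minimal PySpark/Delta Lake sketch of promoting raw (bronze) data into a cleansed (silver) layer. The storage paths, schema, and the event_id/event_ts columns are hypothetical and only illustrate the layering idea, not any specific employer's pipeline.

```python
# Minimal sketch of a Medallion-style bronze -> silver promotion with PySpark
# and Delta Lake. Assumes a Delta-enabled Spark session (e.g. a Databricks
# cluster); paths and column names (event_id, event_ts) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze layer: land the raw events unchanged, append-only.
raw = spark.read.json("/mnt/landing/events/")  # hypothetical landing zone
raw.write.format("delta").mode("append").save("/mnt/bronze/events")

# Silver layer: deduplicate, type, and filter the bronze data.
bronze = spark.read.format("delta").load("/mnt/bronze/events")
silver = (
    bronze
    .dropDuplicates(["event_id"])                     # assumed business key
    .withColumn("event_date", F.to_date("event_ts"))  # assumed timestamp column
    .filter(F.col("event_id").isNotNull())
)
silver.write.format("delta").mode("overwrite").save("/mnt/silver/events")
```

A gold layer would then aggregate the silver tables into reporting-ready marts, which is the "curated" tier such listings usually refer to.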
scalable data infrastructure, develop machine learning models, and create robust solutions that enhance public service delivery. Working in classified environments, you'll tackle complex challenges using tools like Hadoop, Spark, and modern visualisation frameworks while implementing automation that drives government efficiency. You'll collaborate with stakeholders to transform legacy systems, implement data governance frameworks, and ensure solutions meet the … Collaborative, team-based development; Cloud analytics platforms, e.g. relevant AWS and Azure platform services; Data tools: hands-on experience with Palantir (ESSENTIAL); Data science approaches and tooling, e.g. Hadoop, Spark; Software development methods and techniques, e.g. Agile methods such as SCRUM; Software change management, notably familiarity with git; Public sector best practice guidance, e.g. ITIL, OGC toolkit. Additional Requirements …
data governance, security standards, and compliance practices. Strong understanding of metadata management, data lineage, and data quality frameworks. Preferred Skills & Knowledge: Familiarity with big data technologies such as Hadoop, Spark, or Kafka. Excellent communication skills with the ability to explain complex data strategies to non-technical stakeholders. Outstanding problem-solving abilities and organizational skills. Certifications (Preferred/Desirable): Azure …
AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions - Experience building large-scale, high-throughput, 24x7 data systems - Experience with big data technologies such as Hadoop, Hive, Spark, EMR - Experience providing technical leadership and mentoring other engineers on data engineering best practices. Our inclusive culture empowers Amazonians to deliver the best results for our customers. If …
KornShell) - Experience with one or more query languages (e.g., SQL, PL/SQL, DDL, MDX, HiveQL, SparkSQL, Scala) PREFERRED QUALIFICATIONS - Experience with big data technologies such as Hadoop, Hive, Spark, EMR - Experience with an ETL tool such as Informatica, ODI, SSIS, BODI, DataStage, etc. Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have …
Kubernetes). Experience working in environments with AI/ML components or interest in learning data workflows for ML applications. Bonus if you have exposure to Kafka, Spark, or Flink. Experience with data compliance regulations (GDPR). What you can expect from us: Salary 65-75k, opportunity for annual bonuses, medical insurance, cycle to work scheme, work from home …
of the challenges of dealing with large data sets, both structured and unstructured; Used a range of open source frameworks and development tools, e.g. NumPy/SciPy/Pandas, Spark, Kafka, Flink; Working knowledge of one or more relevant database technologies, e.g. Oracle, Postgres, MongoDB, ArcticDB; Proficient on Linux. Advantageous: An excellent understanding of financial markets and instruments; An …
You'll Do Work on Veeva Link's next-gen Data Platform; Improve our current environment with features, refactoring, and innovation; Work with JVM-based languages or Python on Spark-based data pipelines; Operate ML models in close cooperation with our data science team; Experiment in your domain to improve precision, recall, or cost savings. Requirements: Expert skills in … Java or Python; Experience with Apache Spark or PySpark; Experience writing software for the cloud (AWS or GCP); Speaking and writing in English enables you to take part in day-to-day conversations in the team and contribute to deep technical discussions. Nice to Have: Experience with operating machine learning models (e.g., MLFlow); Experience with Data Lakes, Lakehouses …
and reliability across our platform. Working format: full-time, remote. Schedule: Monday to Friday (the working day is 8+1 hours). Responsibilities: Design, develop, and maintain data pipelines using Apache Airflow. Create and support data storage systems (Data Lakes/Data Warehouses) based on AWS (S3, Redshift, Glue, Athena, etc.). Integrate data from various sources, including mobile … attribution, retention, LTV, and other mobile metrics. Ability to collect and aggregate user data from mobile sources for analytics. Experience building real-time data pipelines (e.g., Kinesis, Kafka, Spark Streaming). Hands-on CI/CD experience with GitHub. Startup or small team experience - the ability to quickly switch between tasks, suggest lean architectural solutions, make independent …
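To illustrate the kind of Airflow pipeline this listing describes (mobile attribution data landed in S3, then loaded into Redshift), here is a minimal DAG sketch for Airflow 2.4+. The DAG id, task names, and the two placeholder callables are assumptions for illustration only, not the employer's actual pipeline.

```python
# Minimal sketch of an Airflow DAG (Airflow 2.4+): pull attribution data from a
# mobile source into S3, then load it into Redshift. Names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_to_s3():
    # Placeholder: call the attribution provider's API and write raw files to S3.
    ...


def load_to_redshift():
    # Placeholder: COPY the staged S3 objects into a Redshift table.
    ...


with DAG(
    dag_id="mobile_attribution_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_to_s3", python_callable=extract_to_s3)
    load = PythonOperator(task_id="load_to_redshift", python_callable=load_to_redshift)
    extract >> load
```

In practice the load step would often use a provider transfer or SQL operator rather than a bare PythonOperator, but the DAG structure would be the same.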
Proficiency in one or more programming languages including Java, Python, Scala or Golang. Experience with columnar, analytical cloud data warehouses (e.g., BigQuery, Snowflake, Redshift) and data processing frameworks like Apache Spark is essential. Experience with cloud platforms like AWS, Azure, or Google Cloud. Strong proficiency in designing, developing, and deploying microservices architecture, with a deep understanding of inter…
Works with architects to decompose a solution into Epics. • Leads the team, inspiring them to advance their skills in technologies currently used or not yet used, e.g. Python, Spark, Databricks • Takes initiative to develop knowledge in technology products and tools through on-the-job learning, certifications and projects. • Advances tools and applications by producing expert code and reviews … Python, Snowflake • Strong SQL query writing skills and excellent understanding of SQL query performance optimization • Very good knowledge of Agile and SDLC processes • Strong experience with streaming architecture, preferably Apache Spark. • Knowledge of cloud concepts (Azure), data warehousing and services • Able to demonstrate very good analytical and problem-solving skills. • Sound written and verbal communication skills and ability to …
Tech stack Python (pandas, NumPy, scikit-learn, PyTorch/TensorFlow) SQL (Redshift, Snowflake or similar) AWS SageMaker → Azure ML migration, with Docker, Git, Terraform, Airflow/ADF Optional extras: Spark, Databricks, Kubernetes. What you'll bring 3-5+ years building optimisation or recommendation systems at scale. Strong grasp of mathematical optimisation (e.g., linear/integer programming, meta-heuristics … Production mindset: containerise models, deploy via Airflow/ADF, monitor drift, automate retraining. Soft skills: clear comms, concise docs, and a collaborative approach with DS, Eng & Product. Bonus extras: Spark/Databricks, Kubernetes, big-data panel or ad-tech experience.
Engineers work alongside Machine Learning Engineers, BI Developers and Data Scientists in cross-functional teams with key impacts and visions. Using your skills with SQL, Python, data modelling and Spark to ingest and transform high-volume, complex raw event data into user-friendly, high-impact tables. As a department we strive to give our Data Engineers high levels … Be deploying applications to the Cloud (AWS) We'd love to hear from you if you: Have strong experience with Python & SQL; Have experience developing data pipelines using dbt, Spark and Airflow; Have experience with data modelling (building optimised and efficient data marts and warehouses in the cloud); Work with Infrastructure as Code (Terraform) and containerising applications (Docker); Work with …
technical stakeholders • A background in software engineering, MLOps, or data engineering with production ML experience. Nice to have: • Familiarity with streaming or event-driven ML architectures (e.g. Kafka, Flink, Spark Structured Streaming) • Experience working in regulated domains such as insurance, finance, or healthcare • Exposure to large language models (LLMs), vector databases, or RAG pipelines • Experience building or managing internal …
Bachelor's or Master's degree in Computer Science or Engineering, or relevant hands-on data engineering experience Strong hands-on knowledge of data platforms and tools, including Databricks, Spark, and SQL Experience designing and implementing data pipelines and ETL processes Good knowledge of MLOps principles and best practices to deploy, monitor and maintain machine learning models in …
of professional experience in data engineering roles, preferably for a customer-facing data product Expertise in designing and implementing large-scale data processing systems with data tooling such as Spark, Kafka, Airflow, dbt, Snowflake, Databricks, or similar Strong programming skills in languages such as SQL, Python, Go or Scala Demonstrable use and understanding of effective use of AI …
Azure, AWS, GCP) Hands-on experience with SQL, Data Pipelines, Data Orchestration and Integration Tools Experience in data platforms on-premises/cloud using technologies such as Hadoop, Kafka, Apache Spark, Apache Flink, object, relational and NoSQL data stores. Hands-on experience with big data application development and cloud data warehousing (e.g. Hadoop, Spark, Redshift, Snowflake …
the development and adherence to data governance standards. Data-Driven Culture Champion: Advocate for the strategic use of data across the organization. Skills-wise, you'll definitely need: Expertise in Apache Spark; Advanced proficiency in Python and PySpark; Extensive experience with Databricks; Advanced SQL knowledge; Proven leadership abilities in data engineering; Strong experience in building and managing CI/CD …
systems). Experience with AWS services such as Lambda, SNS, S3, EKS, API Gateway. Knowledge of data warehouse design, ETL/ELT processes, and big data technologies (e.g., Snowflake, Spark). Understanding of data governance and compliance frameworks (e.g., GDPR, HIPAA). Strong communication and stakeholder management skills. Analytical mindset with attention to detail. Leadership and mentoring abilities in … with interface/API data modeling. Knowledge of CI/CD tools like GitHub Actions or similar. AWS certifications such as AWS Certified Data Engineer. Knowledge of Snowflake, SQL, Apache Airflow, and dbt. Familiarity with Atlan for data cataloging and metadata management. Understanding of Iceberg tables. Who we are: We're a global business empowering local teams with exciting …
London, South East, England, United Kingdom (Hybrid/WFH options)
Harnham - Data & Analytics Recruitment
paced, product-focused environment * Proactive, curious, and capable of balancing technical depth with business understanding * Excellent communication skills and a collaborative mindset Tech Stack/Tools: Python, SQL, dbt, Spark or Databricks, GCP (open to AWS/Azure), CI/CD tooling Benefits: * Company profit share scheme * Bupa private healthcare with 24/7 GP access * Up to …