a team environment PREFERRED QUALIFICATIONS - Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions - Familiarity with big data technologies (Hadoop, Spark, etc.) - Knowledge of data security and privacy best practices - Strong problem-solving and analytical skills - Excellent written and verbal communication skills Our inclusive culture empowers Amazonians to deliver the …
means to solve challenges. Proficiency in a programming language (e.g., Scala, Python, Java, C#) with understanding of domain modelling and application development. Knowledge of data management platforms (SQL, NoSQL, Spark/Databricks). Experience with modern engineering tools (Git, CI/CD), cloud platforms (Azure, AWS), and Infrastructure as Code (Terraform, Pulumi). Familiarity with various frameworks across front …
ongoing operations of scalable, performant data warehouse (Redshift) tables, data pipelines, reports and dashboards. Development of moderately to highly complex data processing jobs using appropriate technologies (e.g. SQL, Python, Spark, AWS Lambda). Development of dashboards and reports. Collaborating with stakeholders to understand business domains, requirements, and expectations. Additionally, working with owners of data source systems to understand capabilities …
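As a rough illustration of the kind of data processing job such a posting describes, here is a minimal PySpark batch sketch; the source path, table layout, and aggregation rule are invented for illustration, not taken from any posting:

```python
# Hypothetical PySpark batch job: aggregate raw orders into a daily revenue table.
# Paths and column names are illustrative placeholders.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("daily_revenue").getOrCreate()

orders = spark.read.parquet("s3://example-bucket/raw/orders/")  # hypothetical source

daily_revenue = (
    orders
    .filter(F.col("status") == "completed")
    .groupBy(F.to_date("created_at").alias("order_date"))
    .agg(
        F.sum("amount").alias("total_revenue"),
        F.countDistinct("customer_id").alias("unique_customers"),
    )
)

# Write to a staging location; a COPY into Redshift would typically follow.
daily_revenue.write.mode("overwrite").parquet("s3://example-bucket/marts/daily_revenue/")
```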
Kubernetes). Experience working in environments with AI/ML components or interest in learning data workflows for ML applications. Bonus if you have exposure to Kafka, Spark, or Flink. Experience with data compliance regulations (GDPR). What you can expect from us: Opportunity for annual bonuses Medical Insurance Cycle to work scheme Work from home …
of the challenges of dealing with large data sets, both structured and unstructured. Used a range of open-source frameworks and development tools, e.g. NumPy/SciPy/Pandas, Spark, Kafka, Flink. Working knowledge of one or more relevant database technologies, e.g. Oracle, Postgres, MongoDB, ArcticDB. Proficient on Linux. Advantageous: An excellent understanding of financial markets and instruments. An …
technologies are noteworthy to mention and might be seen as a bonus: AWS (and Data-related proprietary technologies), Azure (and Data-related proprietary technologies), Adverity, Fivetran, Looker, PBI, Tableau, RDBMS, Spark, Redis, Kafka; Being a fair, kind and reliable colleague, open to a bi-directional criticism and feedback loop, autonomous but equally responsible; Working proficiency in the English language; Master's …
and well-tested solutions to automate data ingestion, transformation, and orchestration across systems. Own data operations infrastructure: Manage and optimise key data infrastructure components within AWS, including Amazon Redshift, Apache Airflow for workflow orchestration, and other analytical tools. You will be responsible for ensuring the performance, reliability, and scalability of these systems to meet the growing demands of data … pipelines, data warehouses, and leveraging AWS data services. Strong proficiency in DataOps methodologies and tools, including experience with CI/CD pipelines, containerized applications, and workflow orchestration using Apache Airflow. Familiar with ETL frameworks, with bonus experience in Big Data processing (Spark, Hive, Trino) and data streaming. Proven track record - You've made a demonstrable impact …
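For readers unfamiliar with the Airflow orchestration model mentioned above, here is a minimal DAG sketch, assuming Airflow 2.x; the task bodies, IDs, and schedule are hypothetical placeholders:

```python
# Hypothetical Airflow DAG for the extract -> transform -> load pattern described above.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # pull from a source system

def transform():
    ...  # clean and reshape the data

def load():
    ...  # e.g. COPY into Redshift

with DAG(
    dag_id="example_daily_ingest",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_transform >> t_load  # linear dependency chain
```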
Ability to manage complex systems and troubleshoot production issues effectively. Experience working in an agile, cross-functional team environment. Nice to Have: Experience with big data tools such as Apache Spark, Kafka, or other data processing frameworks or platforms like Databricks, Snowflake. Knowledge of data governance, data security practices, and best practices for managing large data sets that …
of data within the organisation while working with advanced cloud technologies. Key Responsibilities and Deliverables Design, develop, and optimise end-to-end data pipelines (batch & streaming) using Azure Databricks, Spark, and Delta Lake. Implement Medallion Architecture to structure raw, enriched, and curated data layers efficiently. Build scalable ETL/ELT processes with Azure Data Factory and PySpark. Support data … data pipelines. Collaborate with analysts to validate and refine datasets for reporting. Apply DevOps and CI/CD best practices (Git, Azure DevOps) for automated testing and deployment. Optimise Spark jobs, Delta Lake tables, and SQL queries for performance and cost-effectiveness. Troubleshoot and proactively resolve data pipeline issues. Partner with data architects, analysts, and business teams to deliver …
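To make the Medallion Architecture reference concrete, here is a minimal bronze-to-silver promotion sketch with PySpark and Delta Lake; the paths, columns, and quality rules are assumptions for illustration:

```python
# Hypothetical bronze -> silver step in a Medallion architecture.
# Assumes a Spark session with Delta Lake available (e.g. Databricks); names are illustrative.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()

bronze = spark.read.format("delta").load("/mnt/lake/bronze/events")  # raw, as-ingested

silver = (
    bronze
    .dropDuplicates(["event_id"])                      # enforce uniqueness
    .filter(F.col("event_ts").isNotNull())             # basic quality gate
    .withColumn("event_date", F.to_date("event_ts"))   # partition-friendly column
)

(silver.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("/mnt/lake/silver/events"))
```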
AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions - Experience building large-scale, high-throughput, 24x7 data systems - Experience with big data technologies such as: Hadoop, Hive, Spark, EMR - Experience providing technical leadership and mentoring other engineers on data engineering best practices Our inclusive culture empowers Amazonians to deliver the best results for our customers. If …
Bachelor's or Master's degree in Computer Science, Engineering, or relevant hands-on data engineering experience. Strong hands-on knowledge of data platforms and tools, including Databricks, Spark, and SQL. Experience designing and implementing data pipelines and ETL processes. Good knowledge of MLOps principles and best practices to deploy, monitor and maintain machine learning models in …
You'll Do Work on Veeva Link's next-gen Data Platform. Improve our current environment with features, refactoring, and innovation. Work with JVM-based languages or Python on Spark-based data pipelines. Operate ML models in close cooperation with our data science team. Experiment in your domain to improve precision, recall, or cost savings. Requirements Expert skills in … Java or Python. Experience with Apache Spark or PySpark. Experience writing software for the cloud (AWS or GCP). Speaking and writing in English enables you to take part in day-to-day conversations in the team and contribute to deep technical discussions. Nice to Have Experience with operating machine learning models (e.g., MLflow). Experience with Data Lakes, Lakehouses …
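MLflow, named in that nice-to-have, is an experiment-tracking library; a minimal logging sketch looks like the following, with the parameter and metric values invented:

```python
# Minimal MLflow tracking sketch: record hyperparameters and evaluation metrics for a run.
# Values are illustrative; start_run/log_param/log_metric are the standard tracking API.
import mlflow

with mlflow.start_run(run_name="example-run"):
    mlflow.log_param("max_depth", 8)       # hypothetical hyperparameter
    mlflow.log_metric("precision", 0.91)   # hypothetical evaluation results
    mlflow.log_metric("recall", 0.84)
```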
Proficiency in one or more programming languages including Java, Python, Scala or Golang. Experience with columnar, analytical cloud data warehouses (e.g., BigQuery, Snowflake, Redshift) and data processing frameworks like Apache Spark is essential. Experience with cloud platforms like AWS, Azure, or Google Cloud. Strong proficiency in designing, developing, and deploying microservices architecture, with a deep understanding of inter …
Tech stack Python (pandas, NumPy, scikit-learn, PyTorch/TensorFlow) SQL (Redshift, Snowflake or similar) AWS SageMaker → Azure ML migration, with Docker, Git, Terraform, Airflow/ADF. Optional extras: Spark, Databricks, Kubernetes. What you'll bring 3-5+ years building optimisation or recommendation systems at scale. Strong grasp of mathematical optimisation (e.g., linear/integer programming, meta-heuristics … Production mindset: containerise models, deploy via Airflow/ADF, monitor drift, automate retraining. Soft skills: clear comms, concise docs, and a collaborative approach with DS, Eng & Product. Bonus extras: Spark/Databricks, Kubernetes, big-data panel or ad-tech experience.
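To make the mathematical-optimisation requirement concrete, here is a minimal linear-programming sketch using scipy's linprog; the toy objective and constraints are invented for illustration:

```python
# Toy linear program: maximise 3x + 2y subject to x + y <= 4, x <= 3, x, y >= 0.
# scipy minimises by convention, so the objective is negated.
from scipy.optimize import linprog

result = linprog(
    c=[-3, -2],                 # negated objective coefficients
    A_ub=[[1, 1], [1, 0]],      # left-hand sides of the <= constraints
    b_ub=[4, 3],                # right-hand sides
    bounds=[(0, None), (0, None)],
    method="highs",
)
print(result.x, -result.fun)    # optimum at x=3, y=1 with objective 11
```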
Engineers work alongside Machine Learning Engineers, BI Developers and Data Scientists in cross-functional teams with key impacts and visions. Using your skills with SQL, Python, data modelling and Spark to ingest and transform high volume complex raw event data into user-friendly high impact tables. As a department we strive to give our Data Engineers high levels … Be deploying applications to the Cloud (AWS) We'd love to hear from you if you Have strong experience with Python & SQL Have experience developing data pipelines using dbt, Spark and Airflow Have experience with data modelling (building optimised and efficient data marts and warehouses in the cloud) Work with Infrastructure as Code (Terraform) and containerising applications (Docker) Work with …
scalable data infrastructure, develop machine learning models, and create robust solutions that enhance public service delivery. Working in classified environments, you'll tackle complex challenges using tools like Hadoop, Spark, and modern visualisation frameworks while implementing automation that drives government efficiency. You'll collaborate with stakeholders to transform legacy systems, implement data governance frameworks, and ensure solutions meet the … Collaborative, team-based development; Cloud analytics platforms, e.g. relevant AWS and Azure platform services; Data tools: hands-on experience with Palantir (essential); Data science approaches and tooling, e.g. Hadoop, Spark; Data engineering approaches; Database management, e.g. MySQL, Postgres; Software development methods and techniques, e.g. Agile methods such as Scrum; Software change management, notably familiarity with git; Public sector best …
KornShell) - Experience with one or more query languages (e.g., SQL, PL/SQL, DDL, MDX, HiveQL, SparkSQL, Scala) PREFERRED QUALIFICATIONS - Experience with big data technologies such as: Hadoop, Hive, Spark, EMR - Experience with an ETL tool such as Informatica, ODI, SSIS, BODI, or DataStage Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have …
technical stakeholders • A background in software engineering, MLOps, or data engineering with production ML experience Nice to have: • Familiarity with streaming or event-driven ML architectures (e.g. Kafka, Flink, Spark Structured Streaming) • Experience working in regulated domains such as insurance, finance, or healthcare • Exposure to large language models (LLMs), vector databases, or RAG pipelines • Experience building or managing internal …
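As a sketch of the event-driven ML pattern in that nice-to-have list, here is a hypothetical consumer loop that scores each event with a pre-loaded model; the kafka-python KafkaConsumer is a real client, while the topic, server, feature names, and model file are assumptions:

```python
# Hypothetical streaming-scoring loop: consume events from Kafka and apply a model.
import json
import pickle

from kafka import KafkaConsumer

with open("model.pkl", "rb") as f:        # a pre-trained model, e.g. scikit-learn
    model = pickle.load(f)

consumer = KafkaConsumer(
    "events",                             # hypothetical topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    features = [message.value["f1"], message.value["f2"]]  # illustrative features
    score = model.predict([features])[0]
    print(f"offset={message.offset} score={score}")
```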
of professional experience in data engineering roles, preferably for a customer-facing data product. Expertise in designing and implementing large-scale data processing systems with data tooling such as Spark, Kafka, Airflow, dbt, Snowflake, Databricks, or similar. Strong programming skills in languages such as SQL, Python, Go or Scala. Demonstrable use, and an understanding of effective use, of AI …
enjoy working with data at large scale. Key job responsibilities • Own the design, development, and maintenance of last mile data sets • Manipulate/mine data from database tables (Redshift, Apache Spark SQL) • Conduct deep dive investigations into issues related to incorrect and missing data • Identify and adopt best practices in developing data pipelines and tables: data integrity, test … language (e.g., Python, KornShell) - Speak, write, and read fluently in Japanese - Speak, write, and read fluently in English PREFERRED QUALIFICATIONS - Experience with big data technologies such as: Hadoop, Hive, Spark, EMR - Experience with an ETL tool such as Informatica, ODI, SSIS, BODI, or DataStage Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have …
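A small illustration of the "deep dive into missing data" responsibility — a hypothetical Spark SQL check for calendar gaps in a daily table; the table and column names are invented:

```python
# Hypothetical data-quality dive: find calendar dates absent from a daily fact table.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

missing_days = spark.sql("""
    WITH calendar AS (
        -- generate every date in the audit window (sequence steps by 1 day)
        SELECT explode(sequence(DATE'2024-01-01', DATE'2024-12-31')) AS day
    )
    SELECT c.day
    FROM calendar c
    LEFT JOIN daily_sales s ON s.sale_date = c.day   -- illustrative table
    WHERE s.sale_date IS NULL
    ORDER BY c.day
""")
missing_days.show()
```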
the development of, and adherence to, data governance standards. Data-Driven Culture Champion: Advocate for the strategic use of data across the organization. Skills-wise, you'll definitely: Expertise in Apache Spark Advanced proficiency in Python and PySpark Extensive experience with Databricks Advanced SQL knowledge Proven leadership abilities in data engineering Strong experience in building and managing CI/CD …
systems). Experience with AWS services such as Lambda, SNS, S3, EKS, API Gateway. Knowledge of data warehouse design, ETL/ELT processes, and big data technologies (e.g., Snowflake, Spark). Understanding of data governance and compliance frameworks (e.g., GDPR, HIPAA). Strong communication and stakeholder management skills. Analytical mindset with attention to detail. Leadership and mentoring abilities in … with interface/API data modeling. Knowledge of CI/CD tools like GitHub Actions or similar. AWS certifications such as AWS Certified Data Engineer. Knowledge of Snowflake, SQL, Apache Airflow, and dbt. Familiarity with Atlan for data cataloging and metadata management. Understanding of Iceberg tables. Who we are: We're a global business empowering local teams with exciting …
analysis, forecasting and modeling. Preferred Qualifications: Experience in coding for automation (e.g. Scala, Python, Java, etc.) Familiarity with writing data pipelines using one or more big data technologies (e.g. Spark, Hive, Trino, Flink, Airflow) Familiarity with modern cloud technologies such as Kubernetes Familiarity with software development lifecycle for data science and engineering Familiarity with commerce, payments, and user …