Europe, the UK and the US. ABOUT THE ROLE: Sand Technologies focuses on cutting-edge cloud-based data projects, leveraging tools such as Databricks, dbt, Docker, Python, SQL, and PySpark, to name a few. We work across a variety of data architectures such as data mesh, lakehouse, data vault, and data warehouse. Our data engineers create pipelines that support …
data engineering and reporting, including storage, data pipelines to ingest and transform data, and querying and reporting of analytical data. You've worked with technologies such as Python, Spark, SQL, PySpark, Power BI, etc. You're a problem-solver, pragmatically exploring options and finding effective solutions. An understanding of how to design and build well-structured, maintainable systems. Strong communication skills …
across the team. Skills & Experience: Hands-on experience with Azure Databricks, Delta Lake, Data Factory, and Synapse. Strong understanding of Lakehouse architecture and medallion design patterns. Proficient in Python, PySpark, and SQL, with advanced query optimisation skills. Proven experience building scalable ETL pipelines and managing data transformations. Familiarity with data quality frameworks and monitoring tools. Experience working with Git …
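For context on the medallion pattern this listing references, here is a minimal PySpark sketch of a bronze-to-silver promotion on Delta Lake; the paths, table layout, and column names are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion-demo").getOrCreate()

# Bronze layer: raw events landed as-is (hypothetical path).
bronze = spark.read.format("delta").load("/lakehouse/bronze/events")

# Silver layer: cleaned and conformed -- deduplicate, cast types, drop bad rows.
silver = (
    bronze
    .dropDuplicates(["event_id"])
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .filter(F.col("event_id").isNotNull())
)

silver.write.format("delta").mode("overwrite").save("/lakehouse/silver/events")
```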
cooperation with our data science team. Experiment in your domain to improve precision, recall, or cost savings. Requirements: Expert skills in Java or Python. Experience with Apache Spark or PySpark. Experience writing software for the cloud (AWS or GCP). Speaking and writing in English enables you to take part in day-to-day conversations in the team and contribute …
Head of Data Platform and Services, you'll not only maintain and optimize our data infrastructure but also spearhead its evolution. Built predominantly on Databricks, and utilizing technologies like PySpark and Delta Lake, our infrastructure is designed for scalability, robustness, and efficiency. You'll take charge of developing sophisticated data integrations with various advertising platforms, empowering our teams with … and informed decision-making. What you'll be doing for us: Leadership in Design and Development: Lead the architecture, development, and upkeep of our Databricks-based infrastructure, harnessing PySpark and Delta Lake. CI/CD Pipeline Mastery: Create and manage CI/CD pipelines, ensuring automated deployments and system health monitoring. Advanced Data Integration: Develop sophisticated strategies for … standards. Data-Driven Culture Champion: Advocate for the strategic use of data across the organization. Skills-wise, you'll definitely have: Expertise in Apache Spark; advanced proficiency in Python and PySpark; extensive experience with Databricks; advanced SQL knowledge; proven leadership abilities in data engineering; strong experience in building and managing CI/CD pipelines; and experience in implementing data integrations with …
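As an illustration of the Databricks and Delta Lake work this role describes, a minimal sketch of an idempotent upsert using the Delta Lake MERGE API; the table path, source file, and join keys are hypothetical:

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Incoming batch from a hypothetical advertising-platform integration.
updates = spark.read.json("/landing/ad_spend_batch.json")

target = DeltaTable.forPath(spark, "/lakehouse/silver/ad_spend")

# Upsert: update rows that already exist, insert the rest.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.campaign_id = s.campaign_id AND t.spend_date = s.spend_date")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```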
designing and maintaining large-scale data warehouses and data lakes. Expertise in GCP data services including BigQuery, Composer, Dataform, Dataproc, and Pub/Sub. Strong programming experience with Python, PySpark, and SQL. Hands-on experience with data modelling, ETL processes, and data quality frameworks. Proficiency with BI/reporting tools such as Looker or Power BI. Excellent communication and stakeholder …
experience with Azure services such as Data Factory, Databricks, Synapse (DWH), Azure Functions, and other data analytics tools, including streaming. Experience with Airflow and Kubernetes. Programming skills in Python (PySpark) and scripting languages like Bash. Knowledge of Git, CI/CD operations, and Docker. Basic Power BI knowledge is a plus. Experience deploying cloud infrastructure is desirable. Understanding of Infrastructure …
AWS Data Engineer, London, UK (Permanent). Strong experience in Python, PySpark, AWS S3, AWS Glue, Databricks, Amazon Redshift, DynamoDB, CI/CD and Terraform. A total of 7+ years of experience in data engineering is required. Design, develop, and optimize ETL pipelines using AWS Glue, Amazon EMR and Kinesis for real-time and batch data processing. Implement data transformation, streaming …
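A minimal sketch of the kind of Glue ETL job this listing describes; the database, table, and bucket names are hypothetical:

```python
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (hypothetical database/table).
orders = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="raw_orders"
)

# Transform with plain Spark semantics, then write curated Parquet to S3.
cleaned = orders.toDF().filter("order_total > 0").dropDuplicates(["order_id"])
cleaned.write.mode("overwrite").parquet("s3://example-bucket/curated/orders/")

job.commit()
```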
infrastructure. Excellent communication and collaboration skills. Experience working with Git, practicing code reviews and branching strategies, CI/CD, and testing in software solutions. Proficiency in SQL, Python, and PySpark. Ability to translate marketing needs into well-structured data products. Deep understanding of data modeling concepts and building scalable data marts. Basic experience with frontend technologies is a plus. …
Experience with the Agile/Scrum framework. Excellent problem-solving and analytical skills. Excellent communication skills, both at a deep technical level and at stakeholder level. Data: Expert experience with Databricks (PySpark). Experience building and maintaining complex ETL projects end-to-end (ingestion, processing, storage). Expert knowledge and experience with data modelling, data access, and data storage techniques. Experience …
and root cause analysis. Following agreed architectural standards and contributing to their continuous improvement. What do I need? Proficiency in Azure and its data-related services. Strong SQL and PySpark skills, with a focus on writing efficient, readable, modular code. Experience of development on modern cloud data platforms (e.g. Databricks, Snowflake, Redshift). Familiarity with Data Lakehouse principles, standards …
designing, fine-tuning and developing GenAI models and building agent AI systems. Our technology stack: Python and associated ML/DS libraries (scikit-learn, NumPy, LightGBM, pandas, TensorFlow, etc.); PySpark; AWS cloud infrastructure: EMR, ECS, S3, Athena, etc.; MLOps: Terraform, Docker, Airflow, MLflow, Jenkins. On-call statement: Please be aware that our Machine Learning Engineers are required to be …
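To illustrate the MLOps side of the stack above, a minimal MLflow experiment-tracking sketch, assuming a hypothetical local run with synthetic data:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="rf-baseline"):
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    # Log the parameters and metrics that matter for later comparison.
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))
    mlflow.sklearn.log_model(model, "model")
```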
Design and implement end-to-end data architecture on AWS using tools such as Glue, Lake Formation, and Athena. Develop scalable and secure ETL/ELT pipelines using Python, PySpark, and SQL. Drive decisions on data modeling, lakehouse architecture, and integration strategies with Databricks and Snowflake. Collaborate cross-functionally to embed data governance, quality, and lineage into platform design. … Serve as a trusted advisor to engineering and business stakeholders on data strategy and architecture. What You Bring: Deep, hands-on expertise with AWS data services (Glue, Lake Formation, PySpark, Athena, etc.). Strong coding skills in Python and SQL for building, testing, and optimizing data pipelines. Proven experience designing secure, scalable, and reliable data architectures in cloud environments. Solid …
create bespoke, scalable data solutions. Support data migration efforts from Azure to Databricks. Use Terraform to manage and deploy cloud infrastructure. Build robust data workflows in Python (e.g., pandas, PySpark). Ensure the platform is scalable, efficient, and ready for future AI use cases. REQUIRED SKILLS & EXPERIENCE: Strong experience with Azure and Databricks environments. Advanced Python skills for data engineering (… pandas, PySpark). Proficiency in designing and maintaining ETL pipelines. Experience with Terraform for infrastructure automation. Track record of working on cloud migration projects, especially Azure to Databricks. Comfortable working onsite in London 2 days/week and engaging cross-functionally. Strong communication and problem-solving abilities. NICE TO HAVES: Experience with Qlik or other data visualisation tools. Exposure to …
practice. Essential Experience: Proven expertise in building data warehouses and ensuring data quality on GCP. Strong hands-on experience with BigQuery, Dataproc, Dataform, Composer, Pub/Sub. Skilled in PySpark, Python and SQL. Solid understanding of ETL/ELT processes. Clear communication skills and ability to document processes effectively. Desirable Skills: GCP Professional Data Engineer certification. Exposure to Agentic …
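A minimal sketch of querying BigQuery from Python, as referenced in this listing; the project, dataset, and table names are hypothetical:

```python
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

# Standard SQL against a hypothetical warehouse table.
query = """
    SELECT order_date, SUM(order_total) AS daily_revenue
    FROM `example-project.sales.orders`
    GROUP BY order_date
    ORDER BY order_date
"""

# Run the query and pull the result into a pandas DataFrame.
df = client.query(query).to_dataframe()
print(df.head())
```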
on experience with the Azure data stack, critically ADF and Synapse (experience with Microsoft Fabric is a plus). Highly developed Python and data pipeline development knowledge, which must include substantial PySpark experience. Demonstrable DevOps and DataOps experience with an understanding of best practices for engineering, test, and ongoing service delivery. An understanding of Infrastructure as Code concepts (demonstrable Terraform experience …
data modeling, and software architecture. Data Science Library Knowledge: Deep understanding of key Data Science and Machine Learning libraries (e.g., pandas, NumPy, scikit-learn, TensorFlow), with a preference for PySpark experience. Model Productionisation: Experience in taking Machine Learning models from development to production. CI/CD and MLOps Experience: Familiarity with Continuous Integration and Continuous Deployment pipelines, especially in …
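As a small illustration of taking a model toward production as described above, a sketch of a serialisable scikit-learn pipeline; the feature names, training data, and artifact path are hypothetical:

```python
import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Bundling preprocessing with the model keeps train/serve behaviour identical.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

train = pd.DataFrame({"age": [25, 32, 47, 51], "income": [30_000, 42_000, 88_000, 95_000]})
labels = [0, 0, 1, 1]
pipeline.fit(train, labels)

# Persist the whole pipeline as one artifact for the serving environment.
joblib.dump(pipeline, "model_v1.joblib")
```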