data ecosystem. Key Responsibilities: Design, develop, and maintain scalable data pipelines using Databricks and Snowflake. Work with Python libraries such as Pandas, NumPy, PySpark, PyOdbc, PyMsSQL, Requests, Boto3, SimpleSalesforce, and JSON for efficient data processing. Optimize and enhance SQL queries, stored procedures, triggers, and schema designs for RDBMS More ❯
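By way of illustration, a minimal sketch of how a few of the libraries named above might fit together in a single ingestion step; the endpoint, table names, and columns below are hypothetical and not taken from the posting.

```python
import pandas as pd
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ingest_example").getOrCreate()

# Pull a small batch from a REST endpoint (hypothetical URL) with Requests
resp = requests.get("https://api.example.com/orders", timeout=30)
resp.raise_for_status()
records = resp.json()  # list of dicts

# Stage in pandas, then hand off to Spark for distributed processing
df = spark.createDataFrame(pd.DataFrame(records))

# Light cleanup before landing the data (hypothetical table and column names)
clean = df.dropDuplicates(["order_id"]).filter("amount > 0")
clean.write.format("delta").mode("append").saveAsTable("raw.orders")
```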
teams to transform raw data into valuable insights, enabling data-driven decision-making. Key Responsibilities: Develop, maintain, and optimize scalable data pipelines using Python, PySpark, and SQL. Work with Spark SQL to process large datasets in distributed environments. Implement ETL processes to extract, transform, and load data from diverse More ❯
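A minimal Spark SQL sketch of the kind of distributed aggregation described above; the paths and column names are invented for the example.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark_sql_example").getOrCreate()

# Register a source dataset (hypothetical path) as a temporary view
events = spark.read.parquet("s3://example-bucket/events/")
events.createOrReplaceTempView("events")

# Use Spark SQL to aggregate across the distributed dataset
daily = spark.sql("""
    SELECT event_date, event_type, COUNT(*) AS event_count
    FROM events
    GROUP BY event_date, event_type
""")

# Persist the aggregate for downstream reporting
daily.write.mode("overwrite").parquet("s3://example-bucket/marts/daily_event_counts/")
```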
Responsibilities Develop and Maintain Data Integration Solutions o Design and implement data integration workflows using AWS Glue, EMR, AWS MWAA/Airflow, Lambda, and Redshift o Demonstrate proficiency in PySpark, Apache Spark, and Python for processing large datasets o Ensure data is accurately and efficiently extracted, transformed, and loaded into target systems Ensure More ❯
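As a rough illustration of the orchestration side, a skeletal Airflow DAG of the sort MWAA schedules; the DAG and task names are placeholders, and a real workflow would typically hand off to a Glue or EMR job rather than a local Python callable.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_and_load():
    # Placeholder for the extract/transform/load step --
    # in practice this might submit a Glue or EMR job instead.
    ...


with DAG(
    dag_id="example_integration_workflow",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    etl_task = PythonOperator(
        task_id="extract_and_load",
        python_callable=extract_and_load,
    )
```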
Wakefield, Yorkshire, United Kingdom Hybrid / WFH Options
Flippa.com
/CD) automation, rigorous code reviews, documentation as communication. Preferred Qualifications: Familiarity with data manipulation and experience with Python libraries like Flask, FastAPI, Pandas, PySpark, and PyTorch, to name a few. Proficiency in statistics and/or machine learning libraries like NumPy, matplotlib, seaborn, scikit-learn, etc. Experience in building More ❯
storage, data pipelines to ingest and transform data, and querying & reporting of analytical data. You've worked with technologies such as Python, Spark, SQL, PySpark, Power BI, etc. You're a problem-solver, pragmatically exploring options and finding effective solutions. An understanding of how to design and build well-structured More ❯
Proven experience of ETL/ELT, including Lakehouse, Pipeline Design, Batch/Stream processing. Strong working knowledge of programming languages, including Python, SQL, PowerShell, PySpark, Spark SQL. Good working knowledge of data warehouse and data mart architectures. Good experience in Data Governance, including Unity Catalog, Metadata Management, Data Lineage More ❯
and discussions around product development. Stay up to date on the latest industry trends and design patterns. Qualifications: 6+ years of development experience with Spark (PySpark), Python, and SQL. Extensive knowledge of building data pipelines. Hands-on experience with Databricks development. Strong experience developing on Linux OS. Experience with scheduling and More ❯
experience with Data Engineering tools like Databricks, Delta Lake, and Parquet, and proficiency in developing high-performance data pipelines using SQL, Python, PySpark, and Spark clusters for large-scale data processing. Strong expertise in BI & Data Visualization tools, specifically Power BI (dashboard creation, reporting, and analytics). Hands More ❯
code development practices. Knowledge of Apache Spark and similar frameworks to support streaming data. Experience with some of the following Python libraries: NumPy, Pandas, PySpark, Dask, Apache Airflow, Luigi, SQLAlchemy, Great Expectations, Petl, Boto3, matplotlib, dbutils, koalas, OpenPyXL, XlsxWriter. Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK More ❯
FOSSA. • 3+ years of experience with data engineering tools and technologies such as Kubernetes, Container-as-a-Service (CaaS) platforms, OpenShift, Dataproc, Spark (with PySpark), or Airflow. • Experience with CI/CD practices and tools, including Tekton or Terraform, as well as containerization technologies like Docker or Kubernetes. • Excellent More ❯
and support architectural decisions as a recognised Databricks expert. Essential Skills & Experience: Demonstrable expertise with Databricks and Apache Spark in production environments. Proficiency in PySpark, SQL, and working within one or more cloud platforms (Azure, AWS, or GCP). In-depth understanding of Lakehouse concepts, medallion architecture, and modern More ❯
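For context on the medallion pattern mentioned in the listing above, a simplified bronze/silver/gold sketch in PySpark with Delta tables; the landing path, schemas, and table names are illustrative only.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion_example").getOrCreate()

# Bronze: land raw files as-is (hypothetical landing path)
bronze = spark.read.json("/mnt/landing/customers/")
bronze.write.format("delta").mode("append").saveAsTable("bronze.customers")

# Silver: cleanse and conform the raw records
silver = (
    spark.table("bronze.customers")
    .dropDuplicates(["customer_id"])
    .withColumn("email", F.lower("email"))
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver.customers")

# Gold: business-level aggregate for reporting
gold = silver.groupBy("country").agg(F.count("*").alias("customer_count"))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.customers_by_country")
```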
of experience in Data Engineering, with a focus on cloud platforms (Azure, AWS, GCP). You have a proven track record working with Databricks (PySpark, SQL, Delta Lake, Unity Catalog). You have extensive experience in ETL/ELT development and data pipeline orchestration (Databricks Workflows, DLT, Airflow, ADF More ❯
years of creating and maintaining conda environments. 4+ years managing containerized environments with OpenShift or Kubernetes. Technical Skills: Proficiency in Spark frameworks (Python/PySpark, Scala, or Java). Hands-on experience with OpenShift administration (e.g., cluster setup, networking, storage). Proficiency in creating and maintaining conda environments and dependencies. Familiarity More ❯
Azure or AWS) Extensive experience in DBA, schema design & dimensional modelling, and SQL optimization. Programming experience in Python or other languages. Working proficiency with PySpark (Databricks platform preferred). Excellent written and verbal communication skills, with the ability to collaborate effectively with cross-functional teams. Understanding of good engineering practices More ❯
experience in a Data Engineering role: Passion for data and industry best practices in a dynamic environment. Proficiency in technologies such as Spark/PySpark, Azure Data services, Python or Scala, SQL, testing frameworks, open table formats, CI/CD workflows, and cloud infrastructure management. Excellent communication, analytical, and More ❯
platforms for your clients. Work with us to use big data for good. Qualifications You Have: 3+ years of experience using Python, SQL, and PySpark 3+ years of experience utilizing Databricks or Apache Spark Experience designing and maintaining Data Lakes or Data Lakehouses Experience with big data tools such More ❯
systems and data-driven products working with cross-functional teams. Proficient in Python and experienced with common data analytics packages (e.g. NumPy, Pandas, scikit-learn, PySpark). Proficient in SQL. Experience serving containerized solutions in the cloud. Good communication skills and the ability to understand and synthesize requirements across multiple More ❯
Delta Lake/Databricks), PL/SQL, Java/J2EE, React, CI/CD pipelines, and release management. Strong experience in Python, Scala/PySpark, and Perl/scripting. Experience as a Data Engineer for Cloud Data Lake activities, especially in high-volume data processing frameworks, ETL development using distributed More ❯
Leeds, Yorkshire, United Kingdom Hybrid / WFH Options
Low Carbon Contracts Company
in a highly numerate subject is essential. Minimum 2 years' experience in Python development, including scientific computing and data science libraries (NumPy, pandas, SciPy, PySpark). Solid understanding of object-oriented software engineering design principles for usability, maintainability and extensibility. Experience working with Git in a version-controlled environment. Good More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
ADLIB Recruitment
experience as a Senior Data Engineer, with some experience mentoring others. Excellent Python and SQL skills, with hands-on experience building pipelines in Spark (PySpark preferred). Experience with cloud platforms (AWS/Azure). Solid understanding of data architecture, modelling, and ETL/ELT pipelines. Experience using tools like Databricks More ❯
Coalville, Leicestershire, East Midlands, United Kingdom Hybrid / WFH Options
Ibstock PLC
Knowledge, Skills and Experience: Essential Strong expertise in Databricks and Apache Spark for data engineering and analytics. Proficient in SQL and Python/PySpark for data transformation and analysis. Experience in data lakehouse development and Delta Lake optimisation. Experience with ETL/ELT processes for integrating diverse data More ❯
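A small sketch of the routine Delta Lake optimisation referenced above, run as Spark SQL on Databricks; the table and column names are illustrative.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("delta_optimise_example").getOrCreate()

# Compact small files and co-locate rows that are frequently filtered together
# (hypothetical table and Z-order column).
spark.sql("OPTIMIZE silver.sales ZORDER BY (customer_id)")

# Remove data files no longer referenced by the table (default retention applies)
spark.sql("VACUUM silver.sales")
```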
such as Apache Airflow. Strong analytical skills with a proven ability to analyze and optimize data models. Nice to Have: Experience with ETL tools like PySpark, dbt, Azure Data Factory, or Fivetran. Familiarity with BI tools such as Tableau. Strong experience with enterprise data integration methodologies and tools. Compensation More ❯
to Octopus offices across Europe and the US. Our Data Stack: SQL-based pipelines built with dbt on Databricks; analysis via Python Jupyter notebooks; PySpark in Databricks workflows for heavy lifting; Streamlit and Python for dashboarding; Airflow DAGs with Python for ETL running on Kubernetes and Docker; Django for More ❯
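As a taste of the dashboarding layer in a stack like the one above, a minimal Streamlit sketch; the CSV source and column names are invented, and in practice the data would more likely be read from Databricks via a SQL connector.

```python
import pandas as pd
import streamlit as st

# Hypothetical mart exported from the warehouse
daily = pd.read_csv("daily_event_counts.csv")

st.title("Daily events")

# Let the viewer pick an event type, then plot its daily counts
event_type = st.selectbox("Event type", sorted(daily["event_type"].unique()))
filtered = daily[daily["event_type"] == event_type]

st.line_chart(filtered, x="event_date", y="event_count")
```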
a highly numerate subject is essential. At least 2 years of Python development experience, including scientific computing and data science libraries (NumPy, pandas, SciPy, PySpark). Strong understanding of object-oriented design principles for usability and maintainability. Experience with Git in a version-controlled environment. Knowledge of parallel computing techniques More ❯