… to optimize relevant workflow components for advanced artificial intelligence applications. The Contractor shall lead work to optimize cloud-based computing technologies, such as PySpark for distributed computation and model training/inference, and integrate solutions into relevant delivery mechanisms or partner systems. The Contractor shall build tools and scripts …
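As a rough sketch of the kind of PySpark work this role describes (the paths and column names below are hypothetical), a distributed feature-preparation job ahead of model inference might look like:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical example: distributed feature preparation ahead of model inference.
spark = SparkSession.builder.appName("feature-prep").getOrCreate()

# Read a partitioned Parquet dataset; Spark parallelises the scan across executors.
events = spark.read.parquet("s3://example-bucket/events/")  # placeholder path

# Aggregate per entity so downstream inference works on compact features.
features = (
    events
    .groupBy("entity_id")  # placeholder column
    .agg(
        F.count("*").alias("event_count"),
        F.avg("value").alias("mean_value"),  # placeholder column
    )
)

# Repartition before writing to avoid producing many small output files.
features.repartition(64).write.mode("overwrite").parquet("s3://example-bucket/features/")
```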
… of GenAI models. Familiarity with prompt engineering and model optimization techniques. Contributions to open-source projects in the MLOps or GenAI space. Familiarity with PySpark for distributed data processing.
£45,000 - £57,000 a year
We are dedicated to building a diverse, inclusive, and authentic workplace, so if you …
… frameworks and data governance practices, with an emphasis on scalability and compliance in research environments. Enterprise exposure to data engineering tools and products (Spark, PySpark, BigQuery, Pub/Sub), with an understanding of product/market fit for internal stakeholders. Familiarity with cloud computing environments, including but not limited to …
… data graphing tool traversal capabilities built upon Apache Gremlin. (Preferred) • Experience building and operating high-performance data processing pipelines using Lambda, Step Functions, and PySpark on EMR infrastructure. (Preferred) • Experience working with enterprise services used for Data Management, including the enterprise catalog service (and associated APIs), and …
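For context on the EMR pattern mentioned above, a Lambda handler commonly submits a PySpark step to a running cluster via boto3; the cluster ID and script path here are placeholders:

```python
import boto3

def handler(event, context):
    """Hypothetical Lambda handler that queues a PySpark step on EMR.

    Step Functions could invoke this as one state in a larger pipeline.
    """
    emr = boto3.client("emr")
    response = emr.add_job_flow_steps(
        JobFlowId="j-XXXXXXXXXXXXX",  # placeholder cluster ID
        Steps=[{
            "Name": "pyspark-transform",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "s3://example-bucket/jobs/transform.py"],
            },
        }],
    )
    return {"step_ids": response["StepIds"]}
```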
… GitHub. Understanding of how to build and run containerized applications (Docker, Helm). Familiarity with, or a working understanding of, big data and search tools (Airflow, PySpark, Trino, OpenSearch, Elastic, etc.) …
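A minimal Airflow sketch of the orchestration side (DAG name, schedule, and task body are illustrative only; the `schedule=` argument assumes Airflow 2.4+):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder task body; a real task might query Trino or OpenSearch.
    print("extracting...")

# Minimal daily DAG with a single Python task.
with DAG(
    dag_id="example_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="extract", python_callable=extract)
```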
… requirements and mission outcomes.
Requirements
Minimum Qualifications:
- 3-5 years of professional experience in Python development.
- Strong experience with data processing libraries such as PySpark, Pandas, and NumPy.
- Proficiency in API development using Python libraries such as FastAPI.
- Hands-on experience with unit testing frameworks including PyTest and mocking …
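To illustrate the FastAPI-plus-PyTest combination named above, a minimal endpoint and test (using FastAPI's bundled TestClient) might be:

```python
from fastapi import FastAPI
from fastapi.testclient import TestClient

app = FastAPI()

@app.get("/health")
def health() -> dict:
    # Trivial endpoint; a real API would expose domain routes.
    return {"status": "ok"}

# PyTest-style test exercising the endpoint in-process.
client = TestClient(app)

def test_health():
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}
```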
… shape and implement Shell's strategy. What you bring: Substantial experience in technical and process guidance. Experience in Python FastAPI development, Spark/PySpark, TypeScript/React, T-SQL/SQL/Azure SQL, and other programming frameworks and paradigms. Able to mix strategic and pragmatic approaches to …
… of both data pipelines and complex ontology development. Shall have demonstrable, extensive knowledge of distributed computing frameworks (e.g., Spark) and Foundry programming languages (Python, PySpark, SQL, TypeScript). Foundry Data Engineer certification required. Minimum Clearance Required to Start: Top Secret SCI. This position is part of our Federal Solutions team.
… queries for huge datasets. Has a solid understanding of blockchain ecosystem elements like DeFi, exchanges, wallets, smart contracts, mixers, and privacy services.
Databricks and PySpark
Analysing blockchain data
Building and maintaining data pipelines
Deploying machine learning models
Use of graph analytics and graph neural networks
If this sounds like …
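One hedged sketch of the graph-analytics angle, assuming the GraphFrames package is installed on the Spark cluster and using made-up wallet transfer data:

```python
from pyspark.sql import SparkSession
from graphframes import GraphFrame  # requires the graphframes Spark package

spark = SparkSession.builder.getOrCreate()

# Hypothetical wallet-to-wallet transfers as graph edges.
edges = spark.createDataFrame(
    [("walletA", "walletB", 1.5), ("walletB", "walletC", 0.3)],
    ["src", "dst", "amount"],
)
# Derive the vertex set from the endpoints of the edges.
vertices = edges.select("src").union(edges.select("dst")).distinct().toDF("id")

# PageRank surfaces influential addresses in the transfer graph.
g = GraphFrame(vertices, edges)
ranks = g.pageRank(resetProbability=0.15, maxIter=10)
ranks.vertices.orderBy("pagerank", ascending=False).show(5)
```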
… essential skills: Typical Data Engineering Experience required (3+ yrs). Strong knowledge and experience of: Azure Data Factory and Synapse data solution provision; Power BI; Python; PySpark (preference will be given to those who hold relevant certifications). Proficient in SQL. Knowledge of Terraform. Ability to develop and deliver complex visualisation and reporting …
London, South East England, United Kingdom Hybrid / WFH Options
Carnegie Consulting Limited
… in programming languages and data structures such as SAS, Python, R, and SQL is key, with a Python background: particularly familiarity with pandas/polars/pyspark and pytest; understanding of OOP principles; git version control; knowledge of the following frameworks a plus: pydantic, pandera, sphinx. Additionally, experience in any or all …
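A small illustration of the pandas-plus-pandera combination listed above, with a hypothetical trade schema:

```python
import pandas as pd
import pandera as pa
from pandera import Column, Check

# Hypothetical trade schema; pandera enforces it at runtime.
schema = pa.DataFrameSchema({
    "ticker": Column(str),
    "price": Column(float, Check.ge(0)),  # prices must be non-negative
})

df = pd.DataFrame({"ticker": ["ABC"], "price": [10.5]})
validated = schema.validate(df)  # raises pa.errors.SchemaError on bad data
```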
Cardiff, South Glamorgan, Wales, United Kingdom Hybrid / WFH Options
Yolk Recruitment
… Azure-based platforms (Synapse, Data Factory, Databricks). Familiarity with data regulations (GDPR, FCA) and SMCR environments. Bonus points for experience with Python, R, or PySpark.
Why You Should Apply:
Executive-level visibility and the chance to lead a high-impact transformation
Full budget ownership with freedom to shape systems …
… requirements.
Preferred Skills and Experience:
Databricks
Azure Data Factory
Data Lakehouse
Medallion architecture
Microsoft Azure
T-SQL Development (MS SQL Server 2005 onwards)
Python, PySpark
Experience of the following systems would also be advantageous:
Azure DevOps
MDS
Kimball Dimensional Modelling Methodology
Power BI
Unity Catalog
Microsoft Fabric
Experience of …
… and Willingness to Learn): Proficient in Python and SQL. Familiarity with data manipulation frameworks such as Pandas. Exposure to big data processing concepts (e.g., PySpark) is a plus. Basic understanding of ETL processes and tools (e.g., concepts similar to Dagster). Basic understanding of OLAP database concepts (e.g., concepts …
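As a sketch of the Dagster-style ETL concepts referenced above (asset names and data are invented):

```python
import pandas as pd
from dagster import asset

@asset
def raw_orders() -> pd.DataFrame:
    # Placeholder extract step; a real asset might pull from an API or warehouse.
    return pd.DataFrame({"order_id": [1, 2], "amount": [9.99, 25.00]})

@asset
def daily_revenue(raw_orders: pd.DataFrame) -> pd.DataFrame:
    # Transform step: Dagster wires raw_orders in by matching the parameter name.
    return pd.DataFrame({"total": [raw_orders["amount"].sum()]})
```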
… Experience in CI/CD and automation. Knowledge of Identity and Access Management (IAM), networking, and security. Affinity with programming languages such as Python, PySpark, SQL, and Bash. Experience with containerized solutions such as Docker and Kubernetes. You are currently living in the Netherlands and have a valid work …
… on platforms such as AWS, GCP, and Azure. Extensive hands-on experience with cloud-based AI/ML solutions and programming languages (e.g., Python, PySpark), data modelling, and microservices. Proficient in LLM orchestration on platforms such as OpenAI on Azure, AWS Bedrock, GCP Vertex AI, or Gemini AI. Serve …
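A minimal sketch of LLM orchestration against OpenAI on Azure, assuming the `openai` Python package (v1+); the endpoint, key, and deployment name are placeholders:

```python
from openai import AzureOpenAI  # requires openai>=1.0

# Placeholder credentials; real values come from configuration or a secret store.
client = AzureOpenAI(
    azure_endpoint="https://example.openai.azure.com",
    api_key="...",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4o-deployment",  # an Azure deployment name, not a raw model ID
    messages=[{"role": "user", "content": "Summarise this incident report: ..."}],
)
print(response.choices[0].message.content)
```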
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Capgemini
… AI Platforms: Google Cloud Platform, Amazon Web Services, Microsoft Azure, Databricks. Experience in one or more of the listed languages or packages: Python, R, PySpark, Scala, Power BI, Tableau. Proven experience in successfully delivering multiple complex, data-rich workstreams in parallel, supporting wider strategic ambitions and supporting others in …
Newbury, Berkshire, United Kingdom Hybrid / WFH Options
Intuita - Vacancies
… effectiveness, including Azure DevOps. Considerable experience designing and building operationally efficient pipelines, utilising core Azure components such as Azure Data Factory, Azure Databricks, and PySpark. Proven experience in modelling data through a medallion-based architecture, with curated dimensional models in the gold layer built for analytical use. Strong …
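A hedged sketch of the bronze-to-silver hop in a medallion architecture on Databricks; table and column names are placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically in Databricks

# Read the raw bronze table.
bronze = spark.read.table("bronze.sales_raw")  # placeholder table

# Cleanse into the silver layer: dedupe on the business key, drop bad rows,
# and stamp each record with its processing time.
silver = (
    bronze
    .dropDuplicates(["sale_id"])              # placeholder business key
    .filter(F.col("amount").isNotNull())
    .withColumn("ingested_at", F.current_timestamp())
)

silver.write.mode("overwrite").saveAsTable("silver.sales")
```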
… and access controls. Monitor and optimize performance of data workflows using CloudWatch, AWS Step Functions, and performance tuning techniques. Automate data processes using Python, PySpark, SQL, or AWS SDKs. Collaborate with cross-functional teams to support AI/ML, analytics, and business intelligence initiatives. Maintain and enhance CI/CD … a cloud environment.
Required Skills & Qualifications:
5+ years of experience in data engineering with a strong focus on AWS cloud technologies. Proficiency in Python, PySpark, SQL, and AWS Glue for ETL development. Hands-on experience with AWS data services, including Redshift, Athena, Glue, EMR, and Kinesis. Strong knowledge of …
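For reference, a skeleton Glue PySpark job matching the stack above; the database, table, and bucket names are placeholders:

```python
import sys

from awsglue.context import GlueContext
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Glue passes the job name as a required argument.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())

# Read from the Glue Data Catalog as a DynamicFrame.
orders = glue_context.create_dynamic_frame.from_catalog(
    database="example_db", table_name="orders"  # placeholder names
)

# Convert to a Spark DataFrame for SQL-style transforms, then write to S3.
df = orders.toDF().filter("amount > 0")
df.write.mode("append").parquet("s3://example-bucket/clean/orders/")
```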
Future Talent Pool - GCP Data Engineer, London, hybrid role - digital Google Cloud transformation programme. Proficiency in programming languages such as Python, PySpark, and Java to develop ETL processes for data ingestion & preparation; SparkSQL; Cloud Run; Dataflow; Cloud Storage; GCP BigQuery; Google Cloud Platform; Data Studio; Unix/Linux platform; version control tools …
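One way the PySpark-to-BigQuery piece can look, assuming the spark-bigquery connector is available on the cluster (Dataproc bundles it); project and table names are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bq-ingest").getOrCreate()

# Read a BigQuery table through the spark-bigquery connector.
events = (
    spark.read.format("bigquery")
    .option("table", "example-project.analytics.events")  # placeholder table
    .load()
)

# Simple aggregation pushed through Spark's distributed engine.
daily = events.groupBy("event_date").count()  # placeholder column
daily.show()
```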
City of London, London, United Kingdom Hybrid / WFH Options
Syntax Consultancy Limited
… with proficiency in designing and implementing CI/CD pipelines in Cloud environments. Excellent practical expertise in performance tuning and system optimisation. Experience with PySpark and Azure Databricks for distributed data processing and large-scale data analysis. Proven experience with web frameworks, including knowledge of Django and experience with …
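A small example of the kind of PySpark performance tuning mentioned here: broadcasting a small dimension table to avoid shuffling the large side of a join (paths and keys are hypothetical):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

# Hypothetical fact/dimension tables; the paths are placeholders.
fact = spark.read.parquet("/mnt/data/fact_events")
dim = spark.read.parquet("/mnt/data/dim_customers")

# Broadcasting the small dimension table ships it to every executor, so the
# large fact table never has to be shuffled across the network.
joined = fact.join(broadcast(dim), "customer_id")

# Cache only when the result is reused by multiple downstream actions.
joined.cache()
```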