performance and responsiveness. Stay Up to Date with Technology: Keep yourself and the team updated on the latest Python technologies, frameworks, and tools like Apache Spark, Databricks, Apache Pulsar, Apache Airflow, Temporal, and Apache Flink, sharing knowledge and suggesting improvements. Documentation: Contribute to clear and … or Azure. DevOps Tools: Familiarity with containerization (Docker) and infrastructure automation tools like Terraform or Ansible. Real-time Data Streaming: Experience with Apache Pulsar or similar systems for real-time messaging and stream processing is a plus. Data Engineering: Experience with Apache Spark, Databricks, or … similar big data platforms for processing large datasets, building data pipelines, and machine learning workflows. Workflow Orchestration: Familiarity with tools like Apache Airflow or Temporal for managing workflows and scheduling jobs in distributed systems. Stream Processing: Experience with Apache Flink or other stream processing frameworks is a plus.
Manchester, North West, United Kingdom Hybrid / WFH Options
INFUSED SOLUTIONS LIMITED
culture. Key Responsibilities Design, build, and maintain scalable data solutions to support business objectives. Work with Microsoft Fabric to develop robust data pipelines. Utilise Apache Spark and the Spark API to handle large-scale data processing. Contribute to data strategy, governance, and architecture best practices. Identify and … approaches. Collaborate with cross-functional teams to deliver projects on time. Key Requirements ✅ Hands-on experience with Microsoft Fabric. ✅ Strong expertise in Apache Spark and Spark API. ✅ Knowledge of data architecture, engineering best practices, and governance. ✅ DP-600 & DP-700 certifications are highly …
learning libraries in one or more programming languages. Keen interest in some of the following areas: Big Data Analytics (e.g. Google BigQuery/BigTable, Apache Spark), Parallel Computing (e.g. Apache Spark, Kubernetes, Databricks), Cloud Engineering (AWS, GCP, Azure), Spatial Query Optimisation, Data Storytelling with (Jupyter) Notebooks …
Government client on a contract basis. This role requires a deep understanding of data engineering best practices, strong hands-on experience with AWS, Azure, Apache Spark, data warehousing, database modelling and SQL. You'll play a critical role in designing, building, and maintaining our data infrastructure to support … high-performance data pipelines and analytics platforms. Responsibilities will include: Design, build, and maintain robust, scalable, and secure data pipelines using AWS services and Apache Spark. Develop and optimize data models for reporting and analytics in Redshift and other DWH platforms. Collaborate with Data Scientists, Analysts, and Business Stakeholders … database technologies including Oracle, Postgres and MS SQL Server; Strong expertise in AWS services including AWS DMS, S3, Lambda, Glue, EMR, Redshift, and IAM. Proficient in Apache Spark (batch and/or streaming) and big data processing. Solid experience with SQL and performance tuning in data warehouse environments. Hands-on …
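As a toy illustration of the dimensional-modelling and SQL performance-tuning work this listing describes, the sketch below builds a small star schema using only the standard-library sqlite3 module. The table and column names are invented for the example and are not taken from the listing; a real warehouse would be Redshift or similar.

```python
import sqlite3

# Toy star schema: one fact table plus a date dimension, the shape
# commonly used in warehouse data modelling. Names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT);
    CREATE TABLE fact_sales (
        sale_id  INTEGER PRIMARY KEY,
        date_key INTEGER REFERENCES dim_date(date_key),
        amount   REAL
    );
    -- Index on the join key: the kind of tuning step the role calls for.
    CREATE INDEX idx_fact_sales_date ON fact_sales(date_key);
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?)",
                 [(20240101, "2024-01-01"), (20240102, "2024-01-02")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 20240101, 100.0), (2, 20240101, 50.0),
                  (3, 20240102, 75.0)])

# Daily revenue rollup: the typical reporting query against such a model.
daily = conn.execute("""
    SELECT d.iso_date, SUM(f.amount)
    FROM fact_sales f JOIN dim_date d USING (date_key)
    GROUP BY d.iso_date ORDER BY d.iso_date
""").fetchall()
print(daily)  # [('2024-01-01', 150.0), ('2024-01-02', 75.0)]
```

The same fact/dimension split and join-key indexing carry over directly to Redshift-style platforms, where distribution and sort keys play the tuning role the index plays here.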
Working knowledge of two or more common Cloud ecosystems (AWS, Azure, GCP) with expertise in at least one. Deep experience with distributed computing with Apache Spark and knowledge of Spark runtime internals. Familiarity with CI/CD for production deployments. Working knowledge of MLOps. Design and deployment … data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake, and MLflow. Benefits At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of …
Skills: Experience working within the public sector. Knowledge of cloud platforms (e.g., IBM Cloud, AWS, Azure). Familiarity with big data processing frameworks (e.g., Apache Spark, Hadoop). Understanding of data warehousing concepts and experience with tools like IBM Cognos or Tableau. Certifications: While not required, the following … beneficial: Experience working within the public sector. Knowledge of cloud platforms (e.g., IBM Cloud, AWS, Azure). Familiarity with big data processing frameworks (e.g., Apache Spark, Hadoop). Understanding of data warehousing concepts and experience with tools like IBM Cognos or Tableau. ABOUT BUSINESS UNIT IBM Consulting is …
AI solutions using the Databricks Lakehouse (Delta Lake, Unity Catalog, MLflow). Design and lead the development of modular, high-performance data pipelines using Apache Spark and PySpark. Champion the adoption of Lakehouse architecture (bronze/silver/gold layers) to ensure scalable, governed data platforms. Collaborate with … monitoring across data workloads. Mentor engineering teams and support architectural decisions as a recognised Databricks expert. Essential Skills & Experience: Demonstrable expertise with Databricks and Apache Spark in production environments. Proficiency in PySpark, SQL, and working within one or more cloud platforms (Azure, AWS, or GCP). In-depth …
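The bronze/silver/gold layering this listing mentions can be sketched without any dependencies. In practice each layer would be a Delta table transformed with PySpark DataFrames, but the layering logic is the same; every record and field below is invented for the example.

```python
# Medallion-style layering: bronze = raw ingest, silver = deduplicated and
# typed, gold = business-level aggregate. A toy stand-in for Delta tables.
bronze = [  # raw events as ingested: strings, a duplicate, a bad row
    {"user": "a", "amount": "10.5"},
    {"user": "a", "amount": "10.5"},   # duplicate
    {"user": "b", "amount": "oops"},   # unparseable amount
    {"user": "b", "amount": "4.0"},
]

def to_silver(rows):
    """Deduplicate and enforce types; drop rows that fail validation."""
    seen, out = set(), []
    for r in rows:
        key = (r["user"], r["amount"])
        if key in seen:
            continue
        seen.add(key)
        try:
            out.append({"user": r["user"], "amount": float(r["amount"])})
        except ValueError:
            pass  # a real pipeline would quarantine these rows
    return out

def to_gold(rows):
    """Aggregate to a reporting-ready view: total spend per user."""
    totals = {}
    for r in rows:
        totals[r["user"]] = totals.get(r["user"], 0.0) + r["amount"]
    return totals

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'a': 10.5, 'b': 4.0}
```

The governance angle (Unity Catalog) then amounts to controlling who may read each layer, with only gold exposed to downstream analysts.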
to non-technical and technical audiences alike Passion for collaboration, life-long learning, and driving business value through ML Preferred Experience working with Databricks & Apache Spark to process large-scale distributed datasets About Databricks Databricks is the data and AI company. More than 10,000 organizations worldwide - including … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide …
driving business value through ML Company first focus and collaborative individuals - we work better when we work together. Preferred Experience working with Databricks and Apache Spark Preferred Experience working in a customer-facing role About Databricks Databricks is the data and AI company. More than 10,000 organizations … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. Benefits At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of …
We are seeking an experienced Kafka Real-Time Architect to design and implement scalable, high-performance real-time data processing systems leveraging Apache Kafka. In this role, you will be responsible for architecting and managing Kafka clusters, ensuring system scalability and availability, and integrating Kafka with various data processing … approach to addressing business data needs and ensuring optimal system performance. Key Responsibilities: Design & Architecture: Architect and design scalable, real-time streaming systems using Apache Kafka, ensuring they are robust, highly available, and meet business requirements for data ingestion, processing, and real-time analytics. Kafka Cluster Management: Configure, deploy … and troubleshoot issues to maintain smooth operations. Integration & Data Processing: Integrate Kafka with key data processing tools and platforms, including Kafka Streams, Kafka Connect, Apache Spark Streaming, Apache Flink, Apache Beam, and Schema Registry. This integration will facilitate data stream processing, event-driven architectures, and …
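As a schematic of the stream-processing work this listing describes, the sketch below performs a tumbling-window count over an in-memory event stream. In a real deployment the source would be a Kafka topic and the windowing would be handled by Kafka Streams, Spark Structured Streaming, or Flink; the event shapes and window size here are invented for illustration.

```python
from collections import defaultdict

def tumbling_counts(events, window_secs):
    """Group (timestamp, key) events into fixed non-overlapping windows
    and count occurrences per key within each window."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        # Tumbling window: each event falls in exactly one bucket.
        window_start = (ts // window_secs) * window_secs
        windows[window_start][key] += 1
    return {w: dict(counts) for w, counts in sorted(windows.items())}

# A tiny simulated stream of (timestamp-in-seconds, event-type) pairs.
stream = [(0, "click"), (3, "click"), (7, "view"), (11, "click"), (14, "view")]
print(tumbling_counts(stream, 10))
# {0: {'click': 2, 'view': 1}, 10: {'click': 1, 'view': 1}}
```

The same operation expressed in Kafka Streams would be a `groupByKey` followed by a windowed `count`; the point of the sketch is only the windowing semantics.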
years of hands-on experience with big data tools and frameworks. Technical Skills: Proficiency in SQL, Python, and data pipeline tools such as Apache Kafka, Apache Spark, or AWS Glue. Problem-Solving: Strong analytical skills with the ability to troubleshoot and resolve data issues. Communication: Excellent communication …
AWS Certified Data Analytics - Specialty or AWS Certified Solutions Architect - Associate. Experience with Airflow for workflow orchestration. Exposure to big data frameworks such as Apache Spark, Hadoop, or Presto. Hands-on experience with machine learning pipelines and AI/ML data engineering on AWS. Benefits: Competitive salary and …
Databricks. Solid understanding of ETL processes, data modeling, and data warehousing. Familiarity with SQL and relational databases. Knowledge of big data technologies, such as Spark, Hadoop, or Kafka, is a plus. Strong problem-solving skills and the ability to work in a collaborative team environment. Excellent verbal and written …
experience working with relational and non-relational databases (e.g. Snowflake, BigQuery, PostgreSQL, MySQL, MongoDB). Hands-on experience with big data technologies such as Apache Spark, Kafka, Hive, or Hadoop. Proficient in at least one programming language (e.g. Python, Scala, Java, R). Experience deploying and maintaining cloud …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Smart DCC
you be doing? Design and implement efficient ETL processes for data extraction, transformation, and loading. Build real-time data processing pipelines using platforms like Apache Kafka or cloud-native tools. Optimize batch processing workflows with tools like Apache Spark and Flink for scalable performance. Infrastructure Automation: Implement … Integrate cloud-based data services with data lakes and warehouses. Build and automate CI/CD pipelines with Jenkins, GitLab CI/CD, or Apache Airflow. Develop automated test suites for data pipelines, ensuring data quality and transformation integrity. Monitoring & Performance Optimization: Monitor data pipelines with tools like Prometheus …
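The automated data-quality testing mentioned in this listing can be sketched in a few lines. The checks and field names below are invented for illustration; in practice such assertions would live in a pytest suite or a dedicated framework run as a CI/CD pipeline stage.

```python
# Minimal data-quality checks of the kind an automated pipeline test
# suite would run after a transformation step. Field names are examples.
def check_rows(rows):
    """Return a list of violation messages; an empty list means the
    batch passes and may be promoted to the next pipeline stage."""
    violations = []
    for i, r in enumerate(rows):
        if r.get("id") is None:
            violations.append(f"row {i}: missing id")
        if not isinstance(r.get("amount"), (int, float)) or r["amount"] < 0:
            violations.append(f"row {i}: bad amount {r.get('amount')!r}")
    return violations

good = [{"id": 1, "amount": 9.5}, {"id": 2, "amount": 0}]
bad = [{"id": None, "amount": -1}]
print(check_rows(good))  # []
print(check_rows(bad))   # ['row 0: missing id', 'row 0: bad amount -1']
```

Failing the CI/CD job when the returned list is non-empty is what gives the "transformation integrity" guarantee the listing asks for.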
with TensorFlow, PyTorch, Scikit-learn, etc. is a strong plus. You have some experience with large scale, distributed data processing frameworks/tools like Apache Beam, Apache Spark, or even our open source API for it - Scio, and cloud platforms like GCP or AWS. You care about …
and scaling data systems. Highly desired experience with Azure, particularly Lakehouse and Eventhouse architectures. Experience with relevant infrastructure and tools including NATS, Power BI, Apache Spark/Databricks, and PySpark. Hands-on experience with data warehousing methodologies and optimization libraries (e.g., OR-Tools). Experience with log analysis …