performance and responsiveness. Stay Up to Date with Technology: Keep yourself and the team updated on the latest Python technologies, frameworks, and tools like Apache Spark, Databricks, Apache Pulsar, Apache Airflow, Temporal, and Apache Flink, sharing knowledge and suggesting improvements. Documentation: Contribute to clear and … or Azure. DevOps Tools: Familiarity with containerization (Docker) and infrastructure automation tools like Terraform or Ansible. Real-time Data Streaming: Experience with Apache Pulsar or similar systems for real-time messaging and stream processing is a plus. Data Engineering: Experience with Apache Spark, Databricks, or … similar big data platforms for processing large datasets, building data pipelines, and machine learning workflows. Workflow Orchestration: Familiarity with tools like Apache Airflow or Temporal for managing workflows and scheduling jobs in distributed systems. Stream Processing: Experience with Apache Flink or other stream processing frameworks is a plus.
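As a rough illustration of the Pulsar-based real-time messaging experience this listing asks for, a minimal consumer in Python might look like the sketch below; the broker URL, topic, and subscription names are assumptions, not details from the posting.

```python
import pulsar

# Connect to a Pulsar broker; the URL, topic, and subscription are hypothetical.
client = pulsar.Client("pulsar://localhost:6650")
consumer = client.subscribe("orders-events", subscription_name="analytics-sub")

# Block until one message arrives, process it, then acknowledge it so the
# broker does not redeliver it to this subscription.
msg = consumer.receive()
print("Received:", msg.data().decode("utf-8"))
consumer.acknowledge(msg)

client.close()
```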
you'll define the path forward for data, integration, and governance, ensuring our technology aligns with business objectives. Work with cutting-edge tools like Apache Spark, Databricks, Kafka, Airflow, and Azure, while overseeing SQL Server, ETL, data pipelines, and streaming platforms. You'll also drive automation and … from you! You Will Data Strategy & Architecture Develop and maintain the data architecture roadmap, balancing legacy and modern data solutions. Evaluate emerging technologies (e.g., Apache Spark, Kafka) to future-proof our data landscape. Define and enforce data integration standards, ensuring consistency across systems. Solution Design & Implementation Oversee data … Warehousing: Experience with legacy ETL tools and modern data transformation strategies. Microsoft Stack: Proficiency in SQL Server, Azure SQL, and Azure-based data processing. Apache Spark & Databricks: Strong background in large-scale data processing and analytics. Kafka & Streaming: Experience with real-time data ingestion and event-driven architectures.
Manchester, North West, United Kingdom Hybrid / WFH Options
INFUSED SOLUTIONS LIMITED
culture. Key Responsibilities Design, build, and maintain scalable data solutions to support business objectives. Work with Microsoft Fabric to develop robust data pipelines. Utilise Apache Spark and the Spark API to handle large-scale data processing. Contribute to data strategy, governance, and architecture best practices. Identify and … approaches. Collaborate with cross-functional teams to deliver projects on time. Key Requirements Hands-on experience with Microsoft Fabric. Strong expertise in Apache Spark and the Spark API. Knowledge of data architecture, engineering best practices, and governance. DP-600 & DP-700 certifications are highly …
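To illustrate the kind of large-scale processing the Spark API is used for in this role, here is a minimal PySpark sketch; the lakehouse paths, table layout, and column names are hypothetical and not taken from the posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start (or reuse) a Spark session for a batch aggregation job.
spark = SparkSession.builder.appName("daily-sales-rollup").getOrCreate()

# Read raw events from a data lake path (path and schema are assumptions).
events = spark.read.parquet("/lakehouse/raw/sales_events")

# Aggregate revenue per store per day using the DataFrame API.
daily_totals = (
    events
    .withColumn("event_date", F.to_date("event_timestamp"))
    .groupBy("store_id", "event_date")
    .agg(F.sum("amount").alias("total_revenue"))
)

# Write the result back as partitioned Parquet for downstream reporting.
daily_totals.write.mode("overwrite").partitionBy("event_date").parquet(
    "/lakehouse/curated/daily_sales"
)
```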
learning libraries in one or more programming languages. Keen interest in some of the following areas: Big Data Analytics (e.g. Google BigQuery/BigTable, Apache Spark), Parallel Computing (e.g. Apache Spark, Kubernetes, Databricks), Cloud Engineering (AWS, GCP, Azure), Spatial Query Optimisation, Data Storytelling with (Jupyter) Notebooks …
Working knowledge of two or more common Cloud ecosystems (AWS, Azure, GCP) with expertise in at least one. Deep experience with distributed computing with Apache Spark and knowledge of Spark runtime internals. Familiarity with CI/CD for production deployments. Working knowledge of MLOps. Design and deployment … data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake, and MLflow. Benefits At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of …
Skills: Experience working within the public sector. Knowledge of cloud platforms (e.g., IBM Cloud, AWS, Azure). Familiarity with big data processing frameworks (e.g., Apache Spark, Hadoop). Understanding of data warehousing concepts and experience with tools like IBM Cognos or Tableau. Certifications: While not required, the following … beneficial: Experience working within the public sector. Knowledge of cloud platforms (e.g., IBM Cloud, AWS, Azure). Familiarity with big data processing frameworks (e.g., Apache Spark, Hadoop). Understanding of data warehousing concepts and experience with tools like IBM Cognos or Tableau. ABOUT BUSINESS UNIT IBM Consulting is …
to non-technical and technical audiences alike Passion for collaboration, life-long learning, and driving business value through ML Preferred Experience working with Databricks & Apache Spark to process large-scale distributed datasets About Databricks Databricks is the data and AI company. More than 10,000 organizations worldwide - including … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. To learn more, follow Databricks on Twitter, LinkedIn and Facebook. Benefits At Databricks, we strive to provide …
driving business value through ML Company first focus and collaborative individuals - we work better when we work together. Preferred Experience working with Databricks and Apache Spark Preferred Experience working in a customer-facing role About Databricks Databricks is the data and AI company. More than 10,000 organizations … data, analytics and AI. Databricks is headquartered in San Francisco, with offices around the globe and was founded by the original creators of Lakehouse, Apache Spark, Delta Lake and MLflow. Benefits At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of …
Leeds, West Yorkshire, Yorkshire and the Humber, United Kingdom
Pyramid Consulting, Inc
We are seeking an experienced Kafka Real-Time Architect to design and implement scalable, high-performance real-time data processing systems leveraging Apache Kafka. In this role, you will be responsible for architecting and managing Kafka clusters, ensuring system scalability and availability, and integrating Kafka with various data processing … approach to addressing business data needs and ensuring optimal system performance. Key Responsibilities: Design & Architecture: Architect and design scalable, real-time streaming systems using Apache Kafka, ensuring they are robust, highly available, and meet business requirements for data ingestion, processing, and real-time analytics. Kafka Cluster Management: Configure, deploy … and troubleshoot issues to maintain smooth operations. Integration & Data Processing: Integrate Kafka with key data processing tools and platforms, including Kafka Streams, Kafka Connect, Apache Spark Streaming, Apache Flink, Apache Beam, and Schema Registry. This integration will facilitate data stream processing, event-driven architectures, and …
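As a rough illustration of the Kafka-to-Spark integration this role describes, a minimal Spark Structured Streaming job that consumes a Kafka topic might look like the following; the broker address, topic name, and checkpoint path are assumptions for the sketch, not details from the posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Requires the spark-sql-kafka connector package on the Spark classpath.
spark = SparkSession.builder.appName("kafka-stream-ingest").getOrCreate()

# Subscribe to a hypothetical Kafka topic; bootstrap servers are an assumption.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "payments")
    .load()
)

# Kafka delivers keys and values as bytes, so cast the value to a string.
events = raw.select(F.col("value").cast("string").alias("payload"))

# Write to the console sink for demonstration; a real pipeline would target
# Delta, Parquet, or another sink, with the checkpoint enabling recovery.
query = (
    events.writeStream
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/payments-demo")
    .start()
)
query.awaitTermination()
```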
leaders. Ideal Candidate Profile Strong technical background in SQL, scripting languages (Python, TypeScript, JavaScript), databases, ML/LLM models, and big data technologies like Apache Spark (PySpark, Spark SQL). Self-starter with the ability to work from requirements to solutions. Effective communicator, passionate learner, and accountable … Python or KornShell. Knowledge of query languages like SQL, PL/SQL, HiveQL, SparkSQL, Scala. Experience with big data technologies such as Hadoop, Hive, Spark, EMR. Additional Information Amazon's inclusive culture empowers employees to deliver exceptional results. For workplace accommodations during the application process, visit If your region …
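Where this profile lists PySpark and Spark SQL together, the two are typically mixed in the same job, for example by registering a DataFrame as a temporary view and querying it with SQL. A minimal sketch follows; the dataset path and column names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-example").getOrCreate()

# Load a hypothetical orders dataset and expose it to the SQL engine.
orders = spark.read.parquet("/data/orders")
orders.createOrReplaceTempView("orders")

# The same aggregation could be written with the DataFrame API; here it is
# expressed in Spark SQL instead.
top_customers = spark.sql("""
    SELECT customer_id, SUM(order_total) AS lifetime_value
    FROM orders
    GROUP BY customer_id
    ORDER BY lifetime_value DESC
    LIMIT 10
""")
top_customers.show()
```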
years of hands-on experience with big data tools and frameworks. Technical Skills: Proficiency in SQL, Python, and data pipeline tools such as Apache Kafka, Apache Spark, or AWS Glue. Problem-Solving: Strong analytical skills with the ability to troubleshoot and resolve data issues. Communication: Excellent communication …
AWS Certified Data Analytics - Specialty or AWS Certified Solutions Architect - Associate. Experience with Airflow for workflow orchestration. Exposure to big data frameworks such as Apache Spark, Hadoop, or Presto. Hands-on experience with machine learning pipelines and AI/ML data engineering on AWS. Benefits: Competitive salary and …
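For the Airflow orchestration experience mentioned here, a minimal DAG sketch might look like the following; the DAG id, schedule, and task functions are hypothetical placeholders rather than anything specified in the listing.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull raw data from a source system.
    print("extracting raw data")


def transform():
    # Placeholder: clean and reshape the extracted data.
    print("transforming data")


with DAG(
    dag_id="daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+ argument; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # Run the transform only after the extract succeeds.
    extract_task >> transform_task
```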
Databricks. Solid understanding of ETL processes, data modeling, and data warehousing. Familiarity with SQL and relational databases. Knowledge of big data technologies, such as Spark, Hadoop, or Kafka, is a plus. Strong problem-solving skills and the ability to work in a collaborative team environment. Excellent verbal and written …
experience working with relational and non-relational databases (e.g. Snowflake, BigQuery, PostgreSQL, MySQL, MongoDB). Hands-on experience with big data technologies such as Apache Spark, Kafka, Hive, or Hadoop. Proficient in at least one programming language (e.g. Python, Scala, Java, R). Experience deploying and maintaining cloud …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Smart DCC
you be doing? Design and implement efficient ETL processes for data extraction, transformation, and loading. Build real-time data processing pipelines using platforms like Apache Kafka or cloud-native tools. Optimize batch processing workflows with tools like Apache Spark and Flink for scalable performance. Infrastructure Automation: Implement … Integrate cloud-based data services with data lakes and warehouses. Build and automate CI/CD pipelines with Jenkins, GitLab CI/CD, or Apache Airflow. Develop automated test suites for data pipelines, ensuring data quality and transformation integrity. Monitoring & Performance Optimization: Monitor data pipelines with tools like Prometheus …
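As an example of the automated test suites for data pipelines this role describes, a small pytest-style check of a transformation's output might look like this; the transformation function and column names are hypothetical.

```python
import pandas as pd


def add_net_revenue(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical pipeline transformation: derive net revenue from gross and refunds."""
    out = df.copy()
    out["net_revenue"] = out["gross_revenue"] - out["refunds"]
    return out


def test_net_revenue_is_consistent_and_complete():
    sample = pd.DataFrame(
        {"gross_revenue": [100.0, 50.0], "refunds": [10.0, 0.0]}
    )
    result = add_net_revenue(sample)

    # Data-quality assertions: column exists, no nulls, values are consistent.
    assert "net_revenue" in result.columns
    assert result["net_revenue"].notna().all()
    assert (result["net_revenue"] == result["gross_revenue"] - result["refunds"]).all()
```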
with TensorFlow, PyTorch, Scikit-learn, etc. is a strong plus. You have some experience with large scale, distributed data processing frameworks/tools like Apache Beam, Apache Spark, or even our open source API for it - Scio, and cloud platforms like GCP or AWS. You care about …
and scaling data systems. Highly desired experience with Azure, particularly Lakehouse and Eventhouse architectures. Experience with relevant infrastructure and tools including NATS, Power BI, Apache Spark/Databricks, and PySpark. Hands-on experience with data warehousing methodologies and optimization libraries (e.g., OR-Tools). Experience with log analysis …
industries Design and develop feature engineering pipelines, build ML & AI infrastructure, deploy models, and orchestrate advanced analytical insights Write code in SQL, Python, and Spark following software engineering best practices Collaborate with stakeholders and customers to ensure successful project delivery Who we are looking for We are looking for …
SQL, Java Commercial experience in client-facing projects is a plus, especially within multi-disciplinary teams Deep knowledge of database technologies: Distributed systems (e.g., Spark, Hadoop, EMR) RDBMS (e.g., SQL Server, Oracle, PostgreSQL, MySQL) NoSQL (e.g., MongoDB, Cassandra, DynamoDB, Neo4j) Solid understanding of software engineering best practices - code reviews …