one of the biggest financial services organisations in the world. The successful candidate will be responsible for designing, building, and maintaining data pipelines using Apache Spark and Scala. This includes tasks such as: extracting data from various sources (databases, APIs, files); transforming and cleaning the data; loading the data into … Big Data technologies: you'll likely work with various Big Data technologies alongside Spark, including: Hadoop Distributed File System (HDFS) for storing large datasets; Apache Kafka for real-time data streaming; Apache Hive for data warehousing on top of HDFS; cloud platforms such as AWS, Azure, or GCP … of IT experience with designing, building, and maintaining data pipelines. At least 4+ years of experience designing, building, and maintaining data pipelines using Apache Spark and Scala. Programming languages: proficiency in Scala and Spark is essential; familiarity with Python and SQL is often a plus. Big Data technologies more »
mining, data warehousing, ETL. Experience in handling large volumes of data on SQL, NoSQL and Big Data databases. Experience in the Hadoop ecosystem: Hadoop, Spark, Hive, and/or Scala. Experience in programming languages: PHP, Python, C++/Java. Experience in web development in the Laravel MVC Framework. Comfortable working in more »
engineering technology stack compatible with AWS. Experience with web scraping and other data ingestion methods and tools. Knowledge of distributed computing frameworks (Hadoop, Spark, Hive, Presto). Experience with data orchestration tools (Airflow, Orchestra, Azkaban). Expertise in cloud data warehousing and core data modelling concepts. Proficiency in version more »
software engineering, computer science or a similar field. Comfortable programming in Python and Scala (or Java). Knowledgeable in Big Data technologies, in particular Hadoop, Hive, and Spark. Experience in building real-time applications, preferably in Spark. Good understanding of machine learning pipelines and machine learning frameworks such as TensorFlow more »
rapid prototyping and disciplined software development processes. Experience with Python, ML libraries (e.g. spaCy, NumPy, SciPy, Transformers, etc.), data tools and technologies (Spark, Hadoop, Hive, Redshift, SQL), and toolkits for ML and deep learning (SparkML, TensorFlow, Keras). Demonstrated ability to work on multi-disciplinary teams with diverse skill sets. Deploying more »
in processing data with top cloud platforms such as GCP, Azure, AWS. Have experience and/or interest in Big Data technologies such as Hive, Spark, NiFi, HBase, HDFS, Kafka, Kudu. Have experience in leading and managing large accounts, contributing to technical solutions development involving client/stakeholder engagement more »
source-to-target mappings) to testing and service optimisation. Good familiarity with the key services/applications we are developing - Amazon RDS, Amazon DynamoDB, AWS Glue, MapReduce, Hive, Spark, YARN, Airflow. Ability to work with a range of structured, semi-structured and unstructured file formats including Parquet, JSON, CSV, PDF, JPG. Accomplished more »
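The structured/semi-structured distinction the listing draws can be shown with a small sketch. This uses only the standard library (so it covers CSV and JSON; Parquet, PDF and JPG need third-party libraries such as pyarrow), and the field names and sample values are invented for illustration.

```python
import csv
import io
import json

# Structured source: fixed columns, every row has the same shape.
csv_text = "name,score\nada,91\nalan,87\n"
# Semi-structured source: self-describing records whose fields may vary.
json_text = '[{"name": "grace", "score": 95}]'

rows = list(csv.DictReader(io.StringIO(csv_text)))  # CSV values arrive as strings
rows += json.loads(json_text)                       # JSON values keep their types

# Normalise types so both sources share one schema downstream.
records = [{"name": r["name"], "score": int(r["score"])} for r in rows]
print(records)
```

The normalisation step is the important part: regardless of source format, downstream consumers see one consistent schema.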
Azure SQL Data Warehouse, Azure Data Lake, AWS S3, AWS RDS, AWS Lambda or similar. Have experience with open-source big data products, e.g. Hadoop, Hive, Pig, Impala or similar. Have experience with open-source non-relational or NoSQL data repositories such as: MongoDB, Cassandra, Neo4j or similar. Be confident more »
Newcastle upon Tyne, Northumberland, United Kingdom
Confidential
SQL Data Warehouse, Azure Data Lake, AWS S3, AWS RDS, AWS Lambda or similar. Have experience with open-source big data products, e.g. Hadoop, Hive, Pig, Impala or similar. Have experience with open-source non-relational or NoSQL data repositories such as: MongoDB, Cassandra, Neo4j or similar. Be confident more »
concepts and technologies, including star and snowflake schema designs. Big Data Technologies: Understanding of big data platforms such as Hadoop, Spark, and tools like Hive, Pig, and HBase. Data Integration: Ability to integrate data from disparate sources using middleware or integration tools as well as streaming capabilities such as more »
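The star schema mentioned above can be sketched in miniature: descriptive attributes live in small dimension tables, and measures live in a central fact table that references them. This uses SQLite purely for illustration; the table names and sample figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables hold descriptive attributes (the "points" of the star)...
cur.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT)")
cur.execute("CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER)")
# ...and the fact table holds measures plus a foreign key to each dimension.
cur.execute("""CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product,
    date_id    INTEGER REFERENCES dim_date,
    revenue    REAL)""")

cur.executemany("INSERT INTO dim_product VALUES (?, ?)", [(1, "books"), (2, "games")])
cur.executemany("INSERT INTO dim_date VALUES (?, ?)", [(10, 2023), (11, 2024)])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(1, 10, 100.0), (1, 11, 150.0), (2, 11, 80.0)])

# A typical star-schema query: join facts to dimensions, then aggregate.
for row in cur.execute("""
        SELECT p.category, d.year, SUM(f.revenue)
        FROM fact_sales f
        JOIN dim_product p ON f.product_id = p.product_id
        JOIN dim_date d   ON f.date_id = d.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year"""):
    print(row)
```

A snowflake schema differs only in that the dimensions themselves are further normalised into sub-dimension tables.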
Min 7 yrs with Python. Big Data & data lake solutions: PostgreSQL, ClickHouse or Snowflake etc. Cloud infrastructure (AWS services). Data processing pipelines using Kafka, Hadoop, Hive, Storm, or ZooKeeper. Hands-on team leadership. The Reward: joining a fast-growth, successful blockchain business. The role offers fully remote work, a great more »
Ready to utilise your expertise in the Apache Hadoop ecosystem? Are you passionate about Big Data applications? Join us as a Hadoop Architect! Careers at TCS: It means more. TCS is a purpose-led transformation company, built on belief. We do not just help businesses to transform through technology. … every day. Gain exposure to innovative technology. Guide and collaborate with some of the brightest global minds in the industry. The Role: As an Apache Hadoop and Cloud Architect, you will be responsible for the end-to-end platform engineering and migration strategy for Big Data applications. You will … leverage your extensive hands-on experience with the Apache Hadoop ecosystem, open-source Apache projects, Kubernetes, and security protocols. Key Responsibilities: Develop robust architectures and designs for big data platforms and applications within the Apache Hadoop ecosystem. Implement and deploy big data platforms and solutions on-premises and more »
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
Damia Group Ltd
Spark Scala Developer - Scala/Apache Spark - Hybrid/Leeds - £450-£550 Spark Scala Developer to join our client, one of the biggest financial services organisations in the world, with operations in more than 38 countries. It has an IT infrastructure of 200,000+ servers, 20,000+ database instances … Engineer you will be working for the GDT (Global Data Technology) team, and you will be responsible for: Designing, building, and maintaining data pipelines using Apache Spark and Scala. Working on enterprise-scale cloud infrastructure and cloud services in one of the clouds (GCP). Mandatory Skills: At least … IT experience with designing, building, and maintaining data pipelines. At least 4+ years of experience designing, building, and maintaining data pipelines using Apache Spark and Scala. Programming languages: proficiency in Scala and Spark is essential; familiarity with Python and SQL is often a plus. Big Data technologies more »
and analytical role Experience of Data Lake/Hadoop platform implementation Hands-on experience in implementation and performance tuning of Hadoop/Spark implementations Experience with Apache Hadoop and the Hadoop ecosystem Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, ZooKeeper, HCatalog, Solr, Avro) Experience with one … or more SQL-on-Hadoop technologies (Hive, Impala, Spark SQL, Presto) Experience developing software code in one or more programming languages (Java, Python, etc.) Preferred Qualifications: Master's or PhD in Computer Science, Physics, Engineering or Math Hands-on experience leading large-scale global data warehousing and analytics projects Ability more »
of experience of Data Lake/Hadoop platform implementation Good level of hands-on experience in implementation and performance tuning of Hadoop/Spark implementations Experience with Apache Hadoop and the Hadoop ecosystem Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, ZooKeeper, HCatalog, Solr, Avro) Experience with one … or more SQL-on-Hadoop technologies (Hive, Impala, Spark SQL, Presto) Experience developing software code in one or more programming languages (Java, Python, etc.) Preferred Qualifications: Master's or PhD in Computer Science, Physics, Engineering or Maths Hands-on experience leading large-scale global data warehousing and analytics projects Ability more »
SQL Data Warehouse, Azure Data Lake, Azure Databricks, Azure Cosmos DB, Azure Data Factory, Azure Search, Azure Stream Analytics; Delta Lake and Data Lakes; Apache Spark Pools, SQL Pools (dpools and spools). Experience in Python, C#, Spark, PySpark, and Unix shell/Perl scripting. Experience in API data … as part of high-volume data ingestion and transformation pipelines. Data Governance, Data Quality, MDM, Lineage, Data Catalog etc. Development experience using Presto/Hive, Digdag, YAML. About Clarion Events: Clarion Events is one of the world's leading event organisers, producing and delivering innovative and market-leading events more »
Data Architect | Up to £90k Basic | London | Global Consultancy Join a global powerhouse consultancy as a Big Data Architect, leveraging your expertise in the Apache Hadoop ecosystem to drive transformative data initiatives. This role offers a platform to work on high-impact projects with some of the brightest minds … that shape the future of big data environments. 🔍 Role Requirements: Proven experience in architecting, designing, and deploying big data platforms and applications using the Apache Hadoop ecosystem in hybrid and private cloud scenarios. Expertise in hybrid cloud big data platform designs and deployments, especially in AWS, Azure, or Google … Cloud Platform. Extensive knowledge of Apache Hadoop ecosystem components such as HDFS, Hive, HBase, Spark, Ranger, Kafka, and YARN. Proficiency in Kubernetes for container orchestration. Strong understanding of security practices within big data environments. Ability to read, understand, and modify open-source code. Experience with large-scale data more »