services. Hands-on experience with data warehousing tools (e.g., Snowflake, Redshift, BigQuery), Databricks running on multiple cloud platforms (AWS, Azure and GCP), and data lake technologies (e.g., S3, ADLS, HDFS). Expertise in containerization and orchestration tools such as Docker and Kubernetes. Knowledge of MLOps frameworks and tools (e.g., MLflow, Kubeflow, Airflow). Experience with real-time streaming architectures (e.g., Kafka …
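By way of illustration, a minimal sketch of the MLflow experiment tracking named in the listing above; the tracking URI, experiment name, and logged values are hypothetical placeholders, not taken from the listing:

    import mlflow

    # Minimal sketch of MLflow experiment tracking (one of the MLOps tools
    # named above). Tracking URI, experiment name and values are
    # hypothetical placeholders.
    mlflow.set_tracking_uri("http://mlflow.internal:5000")  # placeholder server
    mlflow.set_experiment("demo-experiment")

    with mlflow.start_run():
        mlflow.log_param("max_depth", 8)    # example hyperparameter
        mlflow.log_metric("auc", 0.91)      # example evaluation metric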
a Senior/Lead Data Engineer in complex enterprise environments. Strong coding skills in Python (Scala or functional languages a plus). Expertise with Databricks, Apache Spark, and Snowflake (HDFS/HBase also useful). Experience integrating large, messy datasets into reliable, scalable data products. Strong understanding of data modelling, orchestration, and automation. Hands-on experience with cloud platforms (AWS …
who can do in person with client) Duration: 12 Months Required Skills: Programming Languages: Strong proficiency in Python, Java, and SQL. Big Data Frameworks: Deep understanding of the Hadoop ecosystem (HDFS, MapReduce, Hive, Spark). Cloud Data Warehousing: Expertise in Snowflake architecture, data manipulation, and query optimization. Data Engineering Concepts: Knowledge of data ingestion, transformation, data quality checks, and data security practices. …
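As a rough illustration of the data quality checks and Snowflake work this listing describes, a minimal sketch using the snowflake-connector-python package; every connection parameter and table/column name is a hypothetical placeholder:

    import snowflake.connector

    # Minimal sketch: a completeness check against a Snowflake table before
    # downstream transformation. All connection parameters and table/column
    # names are hypothetical placeholders.
    conn = snowflake.connector.connect(
        user="ETL_USER",
        password="...",
        account="my_account",      # placeholder account identifier
        warehouse="ANALYTICS_WH",
        database="RAW",
        schema="SALES",
    )
    try:
        cur = conn.cursor()
        cur.execute(
            "SELECT COUNT(*) FROM orders "
            "WHERE order_id IS NULL OR order_ts IS NULL"
        )
        bad_rows = cur.fetchone()[0]
        if bad_rows:
            raise ValueError(f"data-quality check failed: {bad_rows} incomplete rows")
    finally:
        conn.close()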
Job Description: Scala/Spark • Good Big Data resource with the below skillset: Java, Big Data technologies. • Linux-based Hadoop ecosystem (HDFS, Impala, Hive, HBase, etc.) • Experience in Big Data technologies; real-time data processing platform (Spark Streaming) experience would be an advantage. • Consistently demonstrates clear and concise written and verbal communication • A history of delivering against agreed objectives • Ability …
premise and cloud solutions. Experience in designing and developing real-time data processing pipelines. Expertise in working with Hadoop data platforms and technologies like Kafka, Spark, Impala, Hive and HDFS in multi-tenant environments. Expert in Java programming, SQL and shell scripting, and DevOps. Good understanding of the current industry landscape and trends in tools for ingestion, processing, consumption and security management. …
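A minimal sketch of the kind of real-time pipeline this listing describes, assuming PySpark Structured Streaming with the spark-sql-kafka connector on the classpath; the broker address, topic, and HDFS paths are hypothetical placeholders:

    from pyspark.sql import SparkSession

    # Minimal sketch: land a Kafka stream as Parquet on HDFS, the sink side
    # of a real-time pipeline. Broker, topic and paths are hypothetical
    # placeholders.
    spark = (
        SparkSession.builder
        .appName("kafka-to-hdfs-sketch")
        .enableHiveSupport()
        .getOrCreate()
    )

    stream = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
        .option("subscribe", "trades")                     # placeholder topic
        .load()
        .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
    )

    (
        stream.writeStream
        .format("parquet")
        .option("path", "hdfs:///data/landing/trades")       # placeholder path
        .option("checkpointLocation", "hdfs:///chk/trades")  # fault-tolerant sink state
        .trigger(processingTime="1 minute")
        .start()
        .awaitTermination()
    )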
CDP) and CDP Services and Big Data knowledge. Proficiency in Terraform for infrastructure as code (IaC). Strong hands-on experience with Cloudera CDP and the Hadoop ecosystem (Hive, Impala, HDFS, etc.). Experience with GitHub Actions or similar CI/CD tools (e.g., Jenkins, GitLab CI). Solid scripting skills in Shell and Python. Extensive experience in designing, provisioning, deploying and …
months Contract Schedule: Monday to Friday Must Haves: 5+ years of professional experience in data engineering, ETL development or Hadoop development. 3+ years working with the Hadoop ecosystem (HDFS, MapReduce, Hive, Spark). 3+ years of Informatica PowerCenter (or similar ETL tool) design & implementation. Prior hands-on experience with BusinessObjects Administration. Proficient in SQL and shell scripting (Unix/Linux). Compensation: Hourly Rate …
infrastructure (VPC, IAM, S3, EC2, networking). The resource will focus on Cloudera cluster migration, data pipeline reconfiguration, and operational stability. Key Responsibilities: Replicate and configure the existing Cloudera cluster (HDFS, YARN, Hive, Spark) in the new AWS account. Coordinate with the project team to ensure proper infrastructure provisioning (EC2, security groups, IAM roles, and networking). Reconfigure cluster connectivity and job … Bachelor's degree in computer science, Information Systems, or a related field. 7+ years of experience in data engineering or big data development. 4+ years' experience with the Cloudera platform (HDFS, YARN, Hive, Spark, Oozie). Experience deploying and operating Cloudera workloads on AWS (EC2, S3, IAM, CloudWatch). Strong proficiency in Scala, Java and HiveQL; Python or Bash scripting experience preferred. Strong …
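On the verification side of such a migration, a minimal boto3 sketch that sanity-checks migrated data in S3; the bucket and prefix are hypothetical, and AWS credentials are assumed to come from the environment:

    import boto3

    # Minimal sketch: confirm a migrated dataset landed in S3 after a
    # Cloudera-to-AWS cutover. Bucket and prefix are hypothetical placeholders.
    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(
        Bucket="cdp-migration-landing",
        Prefix="warehouse/orders/",
    )
    # First page only (up to 1000 keys); a paginator would be used for
    # large prefixes.
    objects = resp.get("Contents", [])
    total_bytes = sum(obj["Size"] for obj in objects)
    print(f"{len(objects)} objects, {total_bytes / 1e9:.2f} GB under warehouse/orders/")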