and data pipeline orchestration tools such as Apache Airflow and NiFi. Experience with large-scale/Big Data technologies such as Hadoop, Spark, Hive, Impala, PrestoDB, and Kafka. Experience with containerisation using Docker and deployment on Kubernetes. Experience with NoSQL and graph …
streamline data workflows and reduce manual interventions. Must have: AWS, ETL, EMR, Glue, Spark/Scala, Java, Python. Good to have: Cloudera (Spark, Hive, Impala, HDFS), Informatica PowerCenter, Informatica DQ/DG, Snowflake, Erwin. Qualifications: Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
data analytics on AWS platforms. Experience writing efficient SQL queries and implementing complex ETL transformations on big data platforms. Experience with Big Data technologies (Spark, Impala, Hive, Redshift, Kafka, etc.). Experience in data quality testing; adept at writing test cases and scripts, and at presenting and resolving data issues.
Data Mining, Classical Machine Learning, Deep Learning, NLP, and Computer Vision. Experience with large-scale/Big Data technology such as Hadoop, Spark, Hive, Impala, and PrestoDB. Hands-on capability developing ML models using open-source frameworks in Python and R and applying them to real client use cases.
the ability to work under pressure on multiple projects with differing timeframes. SQL experience for data analysis on PostgreSQL, MS SQL Server, Impala on Hadoop, etc.
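As a minimal illustration of the kind of analytical SQL this calls for, the following sketch runs against Python's stdlib sqlite3; dialects differ across PostgreSQL, MS SQL Server, and Impala, and the table and column names here are hypothetical, not from any listing:

```python
import sqlite3

# Hypothetical orders table; the aggregation pattern carries over to
# PostgreSQL, MS SQL Server, and Impala even though dialects differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (region TEXT, amount REAL);
INSERT INTO orders VALUES ('EU', 100.0), ('EU', 50.0), ('US', 200.0);
""")

# Per-region totals, largest first -- a typical data-analysis query.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total FROM orders "
    "GROUP BY region ORDER BY total DESC"
).fetchall()
print(rows)  # [('US', 200.0), ('EU', 150.0)]
```

The same GROUP BY / ORDER BY shape is what "data analysis" SQL usually reduces to, whatever the engine underneath.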
and maintain a Dash web application with a user-friendly interface for workflow processing, data visualization, exploration, and efficient reporting. Design and implement relational databases in Impala to store and manage data effectively. Develop optimal schemas for Impala tables based on query patterns and data characteristics. Integrate the Dash application with … Impala to query and process large data sets efficiently. Implement and manage Oozie job schedulers to maintain ETL processes that load, transform, and distribute daily data. Employ agile development practices to deliver effective solutions based on business needs. Required Skills, Education & Experience: Master's or higher … manipulation and analysis using libraries such as Pandas, NumPy, and SQLAlchemy. Extensive experience with the Dash framework for building web applications. In-depth knowledge of Impala or other SQL-on-Hadoop query engines. Understanding of web development concepts (HTML, CSS, JavaScript). Proficiency in data visualization libraries (Plotly, Seaborn).
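The Impala duties above (schema design driven by query patterns, plus a daily Oozie-scheduled load) might look roughly like this sketch; the database, table, and column names are hypothetical, not taken from the posting:

```sql
-- Hypothetical daily-events table: Parquet storage and date partitioning
-- suit daily ETL loads and date-filtered dashboard queries.
CREATE TABLE IF NOT EXISTS reporting.daily_events (
  event_id BIGINT,
  user_id  BIGINT,
  payload  STRING
)
PARTITIONED BY (event_date STRING)
STORED AS PARQUET;

-- A daily Oozie action could register the new partition and refresh
-- table statistics so the Impala query planner stays accurate.
ALTER TABLE reporting.daily_events
  ADD IF NOT EXISTS PARTITION (event_date = '2024-01-01');
COMPUTE STATS reporting.daily_events;
```

Partitioning on the load date lets the daily job append one partition at a time, while date-bounded queries from the Dash front end can prune every other partition.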