… data governance, including GDPR. Bonus Points For: expertise in data modelling, schema design, and handling both structured and semi-structured data; familiarity with distributed systems such as Hadoop, Spark, HDFS, Hive, and Databricks; exposure to AWS Lake Formation and automation of ingestion and transformation layers; a background in delivering solutions for highly regulated industries; a passion for mentoring and enabling data engineering best …
… pipelines using Spark is an added advantage. Advanced SQL experience, including SQL performance tuning, is a must. Should have worked on other big data frameworks such as MapReduce, HDFS, Hive/Impala, and AWS Athena. Experience in logical and physical table design in a big data environment to suit processing frameworks. Knowledge of using, setting up, and tuning resource management frameworks such …
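As a rough illustration of the Spark-plus-SQL-tuning skill set these fragments keep asking for, here is a minimal PySpark sketch; the bucket paths, table names, and columns are all hypothetical, and the two tuning moves shown (right-sizing shuffle partitions and broadcasting a small dimension table) are common examples, not a prescribed approach.

```python
# Minimal PySpark sketch: a pipeline with two common SQL tuning techniques.
# Assumes a working Spark installation; paths, tables, and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    .config("spark.sql.shuffle.partitions", "64")  # right-size shuffles for the data volume
    .getOrCreate()
)

orders = spark.read.parquet("s3://example-bucket/orders/")         # hypothetical fact table
countries = spark.read.parquet("s3://example-bucket/countries/")   # small dimension table

# Broadcast the small dimension table to avoid a shuffle-heavy join.
daily_revenue = (
    orders
    .join(F.broadcast(countries), "country_code")
    .where(F.col("status") == "COMPLETE")
    .groupBy("order_date", "country_name")
    .agg(F.sum("amount").alias("revenue"))
)

daily_revenue.explain()  # inspect the physical plan before writing
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/marts/daily_revenue/"
)
```

Calling `explain()` before the write is the quickest sanity check: seeing a `BroadcastHashJoin` rather than a `SortMergeJoin` in the plan confirms the broadcast hint took effect.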
… Java or Scala for developing scalable backend systems and data pipelines. Solid understanding of SQL and relational databases (e.g., MySQL, PostgreSQL, Hive). Familiarity with the Apache Hadoop ecosystem (HDFS, MapReduce, YARN). Working knowledge of Apache Spark and its modules (e.g., Spark SQL, Spark Streaming, MLlib). Experience with cloud-based data platforms like AWS Glue, Google Cloud Dataflow …
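Of the Spark modules named above, Structured Streaming is the one least covered by the batch sketch earlier, so here is a hedged streaming sketch; the Kafka broker address and topic are invented, and the job assumes the spark-sql-kafka connector package is on the classpath.

```python
# Hedged Structured Streaming sketch: count events per minute from a Kafka topic.
# Broker and topic names are hypothetical; requires the spark-sql-kafka package.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker.example.internal:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka values arrive as bytes; cast them, then window-aggregate on event time.
counts = (
    events.select(F.col("timestamp"), F.col("value").cast("string"))
    .withWatermark("timestamp", "5 minutes")   # bound state for late data
    .groupBy(F.window("timestamp", "1 minute"))
    .count()
)

query = counts.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```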
… Hive, Spark, R, Pig, Oozie workflows. Hadoop ecosystem: Hive data, Oozie, Spark, Pig, Impala, Hue. COTS integration: Knowi, MongoDB, Oracle, MySQL RDS, Elastic, Logstash, Kibana, ZooKeeper, Consul, Hadoop/HDFS. Containerization/configuration tools: Docker, Chef. North Point Technology is THE BEST place to work for curious-minded engineers motivated to support our country's most crucial missions! We focus …
… resolution. Preferred Qualifications: experience with visualization tools and techniques (e.g., Periscope, BusinessObjects, D3, ggplot, Tableau, SAS Visual Analytics, Power BI); experience with big data technologies (e.g., Hadoop, Hive, HDFS, HBase, MapReduce, Spark, Kafka, Sqoop); a Master's degree in mathematics, statistics, computer science/engineering, or other related technical fields, with equivalent practical experience; experience constructing and executing queries …
… methods using parallel computing frameworks (e.g., Deeplearning4j, Torch, TensorFlow, Caffe, Neon, the NVIDIA CUDA Deep Neural Network library (cuDNN), and OpenCV) and distributed data processing frameworks (e.g., Hadoop (including HDFS, HBase, Hive, Impala, Giraph, Sqoop) and Spark (including MLlib, GraphX, SQL, and DataFrames)). Execute data science methods using common programming/scripting languages: Python, Java, Scala, R (statistics). Prior …
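As one hedged example of "executing a data science method" on Spark, the sketch below trains and evaluates a logistic regression with MLlib's DataFrame API; the input path, feature columns, and label column are all assumptions for illustration.

```python
# Minimal PySpark MLlib sketch: train and evaluate a logistic regression.
# Dataset path and column names (f1, f2, f3, label) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = SparkSession.builder.appName("mllib-sketch").getOrCreate()

df = spark.read.parquet("hdfs:///data/events/")  # hypothetical labeled data

# Assemble numeric feature columns into the single vector column MLlib expects.
assembler = VectorAssembler(inputCols=["f1", "f2", "f3"], outputCol="features")
train, test = assembler.transform(df).randomSplit([0.8, 0.2], seed=42)

model = LogisticRegression(labelCol="label", featuresCol="features").fit(train)
auc = BinaryClassificationEvaluator(labelCol="label").evaluate(model.transform(test))
print(f"AUC on held-out data: {auc:.3f}")
```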
… Hive, Spark, R, Pig, Oozie workflows), Elasticsearch, Hadoop (Hive data, Oozie, Spark, Pig, Impala, Hue), COTS integration (Knowi, MongoDB, Oracle, MySQL RDS, Elastic, Logstash, Kibana, ZooKeeper, Consul, Hadoop/HDFS), Docker, and Chef. Who we are: Reinventing Geospatial, Inc. (RGi) is a fast-paced small business that has the environment and culture of a start-up, with the stability and …
… lifecycle. Strong communication skills to translate stakeholder requirements into system use cases. Experience with visualization tools (e.g., Tableau, D3, ggplot). Experience utilizing multiple big data technologies: Hadoop, Hive, HDFS, HBase, MapReduce, Spark, Kafka, Sqoop. Experience with SQL, Spark, and ETL. Experience with extracting, cleaning, and transforming large transactional datasets to build predictive models and generate supporting documentation. TS/SCI …
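The extract-clean-transform pattern described above is concrete enough to sketch. Below is a minimal, assumption-laden PySpark version: the source path, target path, and transactional schema (transaction_id, customer_id, amount, timestamp) are invented for illustration.

```python
# PySpark ETL sketch: extract raw transactions, clean them, write a model-ready table.
# Paths and schema are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

raw = spark.read.option("header", "true").csv("hdfs:///raw/transactions/")

clean = (
    raw
    .dropDuplicates(["transaction_id"])                      # drop replayed records
    .withColumn("amount", F.col("amount").cast("double"))    # enforce numeric type
    .filter(F.col("amount").isNotNull() & (F.col("amount") > 0))
    .withColumn("txn_date", F.to_date("timestamp"))          # normalize the event date
)

# Aggregate to one row per customer per day, a typical predictive-modeling input.
features = clean.groupBy("customer_id", "txn_date").agg(
    F.count("*").alias("txn_count"),
    F.sum("amount").alias("total_spend"),
)
features.write.mode("overwrite").parquet("hdfs:///features/daily_spend/")
```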
… integration project experience on the Hadoop platform, preferably Cloudera. Ab Initio CDC (Change Data Capture) experience in a data integration/ETL project setting is great to have. Working knowledge of HDFS, Hive, Impala, and other related Hadoop technologies. Working knowledge of various AWS services is nice to have. Sound understanding of SQL and the ability to write well-performing SQL queries. Good …
… experience as a Senior Data Engineer in complex enterprise environments. Strong coding skills in Python (Scala or functional languages a plus). Expertise with Databricks, Apache Spark, and Snowflake (HDFS/HBase also useful). Experience integrating large, messy datasets into reliable, scalable data products. Strong understanding of data modelling, orchestration, and automation. Hands-on experience with cloud platforms (AWS …
… data movement and transformation technologies. What We're Looking For: 5+ years of professional experience in data engineering, ETL development, or Hadoop development; 3+ years working with the Hadoop ecosystem (HDFS, MapReduce, Hive, Spark); 3+ years of Informatica PowerCenter (or similar ETL tool) design and implementation; prior hands-on experience with BusinessObjects administration or data platform tools; proficiency in SQL and shell scripting …
… Mathematics, Engineering, or CS field. CPA, CGFM, or CDFM certification. Nice If You Have: experience with modern relational databases, including MySQL or PostgreSQL, and big data systems, including Hadoop, HDFS, Hive, or Cloudera; experience providing recommendations with dashboards, visualizations, or reports using BI platforms such as Qlik Sense or Tableau; the ability to manipulate and integrate databases with languages such as SQL …
Experience supporting the IC or DoD in the cybersecurity domain. Familiarity with the RMF process. Experience with relational database management systems (RDBMS). Experience with Apache Hadoop and the Hadoop Distributed File System (HDFS). Experience with Amazon Elastic MapReduce (EMR) and SageMaker. Experience with machine learning or artificial intelligence. Travel: … Security clearance: Top Secret/SCI/CI Poly …
… other federal partners
• The DTS portfolio encompasses transport streams, messages, and files with content sizes ranging from bytes to terabytes
• Candidates should have experience writing analytics using Apache Hadoop, HDFS, and MapReduce (a minimal sketch follows this list)
• Experience processing large data sets or high-volume data ingest is a plus
• Experience monitoring, maintaining, and troubleshooting Apache Accumulo, Apache Hadoop, and Apache ZooKeeper deployments is required …
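A hedged sketch of what a small MapReduce analytic can look like when written for Hadoop Streaming in Python: word count over files already in HDFS. The script name, HDFS paths, and the streaming-jar location vary by installation and are assumptions here.

```python
#!/usr/bin/env python3
# Hadoop Streaming sketch: a word-count analytic over files in HDFS.
# One hypothetical script acting as both mapper and reducer, selected by argv.
# Submit roughly like (paths are assumptions):
#   hadoop jar hadoop-streaming.jar \
#     -input /data/raw -output /data/counts \
#     -mapper "wordcount.py map" -reducer "wordcount.py reduce" \
#     -file wordcount.py
import sys

def mapper():
    # Emit one "word\t1" pair per token read from stdin.
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Input arrives sorted by key, so counts can be summed in a single pass.
    current, total = None, 0
    for line in sys.stdin:
        word, count = line.rstrip("\n").split("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = word, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```

The reducer can sum in a single pass only because the framework sorts mapper output by key before it reaches the reducer; that sort-then-reduce contract is the heart of the MapReduce model.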
… API, REST. Demonstrated experience working with an Agile Scrum-based development team. DESIRED SKILLS: familiarity with the Assessment & Authorization process and the associated artifact collection process; familiarity with the Hadoop Distributed File System (HDFS); background utilizing Jira for documenting and tracking software development tasks. WHAT YOU'LL NEED TO SUCCEED: Education: a Bachelor's degree in Computer Science, Engineering, or a related technical discipline, or …
… interacting in a cross-Integrated Product Team (IPT) environment. Experience planning and integrating secure, compliant solutions based on legal, policy, and compliance directives. Knowledge of and experience with the internals of HDFS, Hadoop, and/or HBase/Accumulo BigTable-style stores. An active TS/SCI security clearance with a current polygraph is required. Peraton offers enhanced benefits to employees working on this critical …
Ace Hardware Corporation, Oak Brook, Illinois, United States (Hybrid/WFH options)
… clusters (Cloudera distribution), including performing backup and restore operations and supporting development, test, and production systems. Key Responsibilities: Cloudera Hadoop administration: manage and support Cloudera Hadoop clusters and services (HDFS, YARN, Hive, Impala, Spark, Oozie, etc.); perform cluster upgrades, patching, performance tuning, capacity planning, and health monitoring; secure the Hadoop platform using Kerberos, Ranger, or Sentry; develop and maintain … Delta Lake architecture. Experience with IAM, Active Directory, and SSO integration. Familiarity with DevOps and CI/CD for data platforms. Deep understanding of the Hadoop ecosystem: Hive, Impala, Spark, HDFS, YARN. Experience integrating data from DB2 into Hadoop/Databricks using tools like Sqoop or custom connectors. Scripting skills in shell and/or Python for automation and system administration. …
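Where this posting pairs "health monitoring" with "Python for automation", one plausible, purely illustrative shape is a script that polls the NameNode's JMX endpoint; the hostname is invented, port 9870 is the Hadoop 3.x default, and the bean and attribute names are the ones HDFS commonly exposes, so verify them against your cluster before relying on this.

```python
# Hedged sketch of a Python health check against an HDFS NameNode's JMX endpoint.
# Host is hypothetical; 9870 is the default NameNode HTTP port on Hadoop 3.x.
import json
import urllib.request

NAMENODE = "http://namenode.example.internal:9870"

def fetch_bean(query: str) -> dict:
    # The NameNode web UI serves JMX beans as JSON under /jmx?qry=<bean name>.
    with urllib.request.urlopen(f"{NAMENODE}/jmx?qry={query}") as resp:
        return json.load(resp)["beans"][0]

fs = fetch_bean("Hadoop:service=NameNode,name=FSNamesystemState")
used_pct = 100 * (1 - fs["CapacityRemaining"] / fs["CapacityTotal"])

print(f"Live datanodes:   {fs['NumLiveDataNodes']}")
print(f"Dead datanodes:   {fs['NumDeadDataNodes']}")
print(f"Under-replicated: {fs['UnderReplicatedBlocks']}")
print(f"Capacity used:    {used_pct:.1f}%")

# Fail the check (e.g., for a cron or Nagios wrapper) on obvious trouble.
assert fs["NumDeadDataNodes"] == 0 and used_pct < 85, "HDFS health check failed"
```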
… have demonstrated work experience with the MapReduce programming model and technologies such as Hadoop, Hive, Pig, etc.; shall have demonstrated work experience with the Hadoop Distributed File System (HDFS); shall have demonstrated work experience with serialization formats such as JSON and/or BSON …
… evaluation, enhancement, maintenance, testing, and problem diagnosis/resolution. You will work on a software development program proving software development engineering strategies for environments using the Hadoop Distributed File System (HDFS), MapReduce, and other related cloud technologies. You will provide set-up, configuration, and software installation for development, test, and production systems. Interface directly with the development team as well as … Big Table; convert existing algorithms or develop new algorithms to utilize the MapReduce programming model and technologies such as Hadoop, Hive, and Pig; support operational systems utilizing HDFS; support the deployment of operational systems and applications in a cloud environment; conduct scalability assessments of cloud-related algorithms, applications, and systems to identify performance bottlenecks and areas for improvement …