Scala). Experience with streaming technologies such as Spark Streaming, Flink, or Apache Beam. Experience with Kafka is a plus. Working experience with various NoSQL databases such as Cassandra, HBase, MongoDB, and/or Couchbase. Prior Machine Learning or Deep Learning knowledge is a plus (this can be learned on the job). You will …
Job Description: Scala/Spark • Good Big Data resource with the below skillset: Java Big Data technologies. • Linux-based Hadoop ecosystem (HDFS, Impala, Hive, HBase, etc.) • Experience in Big Data technologies; real-time data processing platform (Spark Streaming) experience would be an advantage. • Consistently demonstrates clear and concise written and verbal communication • A history of delivering against agreed objectives …
using git or similar version control systems Experience with extract, transform and load (ETL) development for data ingest pipelines Experience with relational databases, MySQL, NiFi, Kafka, Elastic MapReduce (EMR), HBase, Elastic, Splunk, Spring Experience with CI/CD pipelines Demonstrated ability to research and evaluate the latest emerging technologies and apply them Experience providing analytical judgement and …
benefits (e.g. UK pension scheme) What do you offer? Strong hands-on experience working with modern Big Data technologies such as Apache Spark, Trino, Apache Kafka, Apache Hadoop, Apache HBase, Apache NiFi, Apache Airflow, OpenSearch Proficiency in cloud-native technologies such as containerization and Kubernetes Strong knowledge of DevOps tools (Terraform, Ansible, ArgoCD, GitOps, etc.) Proficiency in software development …
results Experience: 5+ years of relevant professional experience Strong experience with Spark and Airflow Experience with the Hadoop (or similar) ecosystem: S3, DynamoDB, MapReduce, YARN, HDFS, Hive, Spark, Presto, Pig, HBase, Parquet Strong skills in a scripting language (Python, Bash, etc.) Experience driving the building of complex data models and pipelines Proficient in at least one of the SQL languages (MySQL …
/Lead Data Engineer in complex enterprise environments. Strong coding skills in Python (Scala or functional languages a plus). Expertise with Databricks, Apache Spark, and Snowflake (HDFS/HBase also useful). Experience integrating large, messy datasets into reliable, scalable data products. Strong understanding of data modelling, orchestration, and automation. Hands-on experience with cloud platforms (AWS, Azure …
was created by experts with decades of experience building tools for data science and machine learning. From co-authors of pandas to Apache PMC members for HDFS, Arrow, Iceberg, and HBase, the LanceDB team has created open-source tools used by millions worldwide.
reliability or DevOps Experience with Kubernetes and Istio for on-premise deployment Experience with in-stream data processing and analytics using open source platforms such as Apache Kafka, Spark, HBase, HDFS, Flink Experience troubleshooting hardware and network-layer issues Programming experience in Python, C#, Java, Scala, Go or similar languages Good understanding of version control, testing, continuous integration, build …
focused: Python, Spark, and Hadoop Job Duties: 6-7+ years' experience working in Data Engineering and Data Analysis. Hands-on experience in the Hadoop stack of technologies (Hadoop, PySpark, HBase, Hive, Pig, Sqoop, Scala, Flume, HDFS, MapReduce). Hands-on experience with Python & Kafka. Good understanding of Database concepts, Data Design, Data Modeling and ETL. Hands on …
This is an FTE role with Incedo. Data Engineer (2 openings) 8+ years' experience working in Data Analysis. Hands-on experience in the Hadoop stack of technologies (Hadoop, Spark, HBase, Hive, Pig, Sqoop, Scala, Flume, HDFS, MapReduce). Hands-on experience with Python & Kafka. Good understanding of Database concepts, Data Design, Data Modeling and ETL. Hands on …
Cleveland, OH, Pittsburgh, PA, Dallas, TX. Job Description: 8-10 years' experience working in Data Engineering and Data Analysis. Hands-on experience in the Hadoop stack of technologies (Hadoop, PySpark, HBase, Hive, Pig, Sqoop, Scala, Flume, HDFS, MapReduce). Hands-on experience with Python & Kafka. Good understanding of Database concepts, Data Design, Data Modeling and ETL. Hands-on experience in analyzing …
data quality, monitoring, and observability tools Experience with autonomous vehicle data or sensor data processing Experience with large scale streaming platforms (e.g. Kafka, Kinesis) and storage engines (e.g. HDFS, HBase) Compensation: The monthly salary range for this position is $5,500 to $9,500. Compensation will vary based on geographic location and level of education. Additional benefits may include …
consuming web services via SOAP and REST; Developing applications using Java, J2EE, and Linux shell scripting; Implementing Tomcat, JBoss, RabbitMQ, MySQL, SQL Server, Oracle, Hive, Hadoop, Spark, Kafka, MongoDB, HBase, SOAP and RESTful web services; Implementing Agile application development methodology; and Implementing programming languages, software, applications, and technologies including Java, J2EE, Spring Framework, JPA, and Quartz Scheduler. 100% telecommuting …
of experience: Knowledge of machine learning algorithms. Experience with deep learning frameworks, such as TensorFlow, PyTorch, etc. Extensive experience with big data and distributed computing systems such as Spark, HBase, Hadoop, Iceberg. Significant contributions to open source projects. Experience working with public clouds, especially AWS. More information: To learn more about the work we do and to check if it …
San Diego, California, United States Hybrid/Remote Options
Cordial Experience, LLC
NumPy, and MATLAB; Experience with data visualization tools, including D3.js and ggplot; Experience using query languages, including SQL, Hive, and Pig; Experience with NoSQL databases, including MongoDB, Cassandra, and HBase; Experience with applied statistics skills, including distributions, statistical testing, and regression; Experience with Dagster or Airflow for ML pipeline management; Experience with Python; and Experience with cloud environments, including …
10025 - Sr. Big Data Engineer Job Summary Design, build, and maintain the Information/Proposed Changes platform that enables large-scale data processing and analysis. Responsible for developing and maintaining data pipelines, data lakes, and other data-related platforms. Work …
of A/B testing framework. Build services which respond to batch and real-time data to safely roll out features and experiments using a technology stack of A/B testing, Hadoop, Spark, Flink, HBase, Druid, Python, Java, Distributed Systems, React and statistical analysis. Work closely with partners to implement sophisticated statistical methodology into the platform. Telecommuting is permitted. Minimum Requirements: Master's degree (or …
Server, MySQL, BigQuery, MongoDB) required. Two years of experience with distributed computing principles and programming languages (e.g., Python, Java, C) required. Two years of experience with NoSQL databases (e.g., HBase, Spark, Hive, Cloudera) required. One year of experience with Google Cloud Platform required. In lieu of a Bachelor's degree in Computer Science, Statistics, or a related technical field and …
of the following skills: At least ten years of demonstrated experience with relational databases such as Microsoft SQL Server or Oracle, and NoSQL databases such as MongoDB, CouchDB, HBase, Cosmos DB At least seven years of experience evaluating and working in production environments with cloud data systems, including RDBMSs, data warehouses, data lakes, data pipelines Experience with RDBMSs, NoSQL …
our dynamic team. The ideal candidate will have a strong background in developing batch processing systems, with extensive experience in the Apache Hadoop ecosystem (MapReduce, Oozie, Hive, Pig, HBase, Storm). This role involves working in Java, and working on Machine Learning pipelines for data collection or batch inference. This is a remote position, requiring excellent communication skills … Location: US-Remote What you will be doing: Develop scalable and robust code for large scale batch processing systems using Hadoop, Oozie, Pig, Hive, MapReduce, Spark (Java), Python, HBase Develop, manage, and maintain batch pipelines supporting Machine Learning workloads Leverage GCP for scalable big data processing and storage solutions Implement automation/DevOps best practices for CI/CD, IaC, etc. Requirements: Proficiency in the Hadoop ecosystem with MapReduce, Oozie, Hive, Pig, HBase, Storm Strong programming skills with Java, Python, and Spark Knowledge of public cloud services, particularly GCP. Experienced in applying Infrastructure and DevOps principles in daily work. Utilize tools for continuous integration and continuous deployment (CI/CD), and Infrastructure as Code …
provide standard interfaces for analytics on large security-related datasets, and lead innovation by implementing data-centric technologies. You'll work with Big Data technologies such as Hadoop/HBase, Cassandra, and Bigtable while managing ETL processes and performing system administration and performance tuning. This role is contingent on contract award. What we are looking for: Associate: Bachelor's … or Master's degree with 8+ years of experience. Recognized authority providing innovative solutions to complex technical problems and leading advanced development efforts. Experience with Big Data technologies (Hadoop, HBase, Cassandra, Bigtable), ETL processes, data platform architecture, cloud computing infrastructure (AWS), programming languages (Java, Python), and distributed systems design. Highly preferred: Open source project contributions. Experience with distributed RDBMS …