with TensorFlow, PyTorch, Scikit-learn, etc. is a strong plus. You have some experience with large-scale, distributed data processing frameworks/tools like Apache Beam, Apache Spark, or even our open-source API for it, Scio, and cloud platforms like GCP or AWS. You care about …
architectures for ML frameworks in complex problem spaces in collaboration with product teams. Experience with large-scale, distributed data processing frameworks/tools like Apache Beam, Apache Spark, and cloud platforms like GCP or AWS. Where You'll Be: We offer you the flexibility to work where …
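Since Apache Beam (and Spotify's Scio wrapper for it) recurs throughout these listings, a minimal sketch of a batch Beam pipeline may help ground the requirement. This uses the Python SDK; the input and output paths are hypothetical:

```python
# Minimal Apache Beam batch pipeline (Python SDK): word count over a text file.
# Uses the local DirectRunner by default; on GCP this would run on Dataflow.
import apache_beam as beam

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "Read" >> beam.io.ReadFromText("input.txt")        # hypothetical input path
        | "Split" >> beam.FlatMap(lambda line: line.split())
        | "PairWithOne" >> beam.Map(lambda word: (word, 1))
        | "Count" >> beam.CombinePerKey(sum)
        | "Format" >> beam.MapTuple(lambda word, n: f"{word}\t{n}")
        | "Write" >> beam.io.WriteToText("counts")           # hypothetical output prefix
    )
```

Scio exposes the same pipeline model in Scala, with a ScioContext and method chaining in place of the `|` operator.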
Databricks. Must Have: Hands-on experience on at least 2 Hyperscalers (GCP/AWS/Azure platforms), specifically in big data processing services (Apache Spark, Beam, or equivalent). In-depth knowledge of key technologies like BigQuery/Redshift/Synapse, Pub/Sub/Kinesis … years' experience in a similar role. Ability to lead and mentor the architects. Mandatory Skills [at least 2 Hyperscalers]: GCP, AWS, Azure, big data, Apache Spark, Beam on BigQuery/Redshift/Synapse, Pub/Sub/Kinesis/MQ/Event Hubs, Kafka, Dataflow/Airflow/ADF …
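To make the Pub/Sub-to-warehouse pattern named above concrete, here is a hedged sketch of a streaming Beam pipeline reading from Pub/Sub and writing to BigQuery, as it might run on Dataflow. The project, topic, table, and schema are hypothetical placeholders:

```python
# Hedged sketch: streaming pipeline, Pub/Sub -> BigQuery (Dataflow-style).
# Project, topic, table, and schema below are hypothetical placeholders.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(streaming=True)
with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadPubSub" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/events")   # hypothetical topic
        | "Parse" >> beam.Map(json.loads)                # one JSON object per message
        | "WriteBQ" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",               # hypothetical table
            schema="user_id:STRING,event:STRING,ts:TIMESTAMP",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

The same shape maps onto Kinesis/Redshift or Event Hubs/Synapse on the other hyperscalers, which is presumably why the listing treats them as interchangeable.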
highest data throughput are implemented in Java. Within Data Engineering, Dataiku, Snowflake, Prometheus, and ArcticDB are used heavily. Kafka is used for data pipelines, Apache Beam for ETL, Bitbucket for source control, Jenkins for continuous integration, Grafana + Prometheus for metrics collection, and ELK for log shipping and monitoring …
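A sketch of how the Kafka-plus-Beam ETL combination mentioned here might look in the Beam Python SDK. The broker address, topic, and output path are assumptions, and the Python Kafka connector runs via a cross-language expansion service:

```python
# Hedged sketch: Beam ETL job consuming from Kafka via the Python SDK's
# cross-language Kafka connector. Broker, topic, and sink are assumptions.
import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka

with beam.Pipeline() as pipeline:
    (
        pipeline
        | "ReadKafka" >> ReadFromKafka(
            consumer_config={"bootstrap.servers": "broker:9092"},
            topics=["trades"],
        )                                                  # yields (key, value) byte pairs
        | "Decode" >> beam.MapTuple(lambda key, value: value.decode("utf-8"))
        | "Write" >> beam.io.WriteToText("etl-output")     # placeholder sink
    )
```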
includes a firm-wide data catalogue detailing key data elements. Skills Required: Java and/or Scala backends; distributed compute concepts: Spark, Databricks, Apache Beam, etc.; full-stack experience; Azure cloud technologies; Angular or similar front end; cloud DB/relational DB: Snowflake/Sybase; Unix experience …
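For the distributed compute concepts listed (Spark, Databricks), a brief PySpark sketch of a distributed aggregation; the input path and column names are hypothetical:

```python
# Brief PySpark sketch: a distributed aggregation over a DataFrame.
# The CSV path and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("agg-demo").getOrCreate()

df = spark.read.csv("trades.csv", header=True, inferSchema=True)
(
    df.groupBy("symbol")                       # triggers a distributed shuffle
      .agg(F.avg("price").alias("avg_price"))
      .orderBy("symbol")
      .show()
)
spark.stop()
```

On Databricks the same DataFrame code runs unchanged against a managed cluster, which is the point of listing the two together.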
or all of the services below would put you at the top of our list: Google Cloud Storage. Google Data Transfer Service. Google Dataflow (Apache Beam). Google Pub/Sub. Google Cloud Run. BigQuery or any RDBMS. Python. Debezium/Kafka. dbt (Data Build Tool). Interview process: Interviewing is …
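As an illustration of one of the listed services: publishing a message to Google Pub/Sub with the official Python client. The project and topic IDs are placeholders:

```python
# Sketch: publishing one message with the google-cloud-pubsub client.
# Project and topic IDs are hypothetical placeholders.
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "signups")

# publish() returns a future; result() blocks until the server acknowledges.
future = publisher.publish(topic_path, data=b'{"event": "signup"}')
print(future.result())  # server-assigned message ID
```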
Python and SQL. Appreciation of good software engineering practice, e.g. version control and test-driven development. Hands-on experience using big data technologies such as Apache Beam or Spark. Knowledge of techniques like embeddings is highly desirable. GCP (or other cloud platform) experience is desirable. What's in it for …
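On the embeddings point: one common approach (an assumption here, not something the listing prescribes) is a sentence-transformers model; a minimal sketch:

```python
# Sketch: text embeddings with sentence-transformers (one common library choice;
# the model name is an assumption, not something the role specifies).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(["a data pipeline", "an ETL job"])  # ndarray, one row per text

# Cosine similarity between the two sentence vectors.
a, b = vectors
print(float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))))
```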
Who You Are: You have proven experience in data engineering, including creating reliable, efficient, and scalable data pipelines using data processing frameworks such as Scio, Dataflow, Beam, or equivalent. You are comfortable working with large datasets using SQL and data analytics platforms such as BigQuery. You are knowledgeable in cloud-based …
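To illustrate the BigQuery/SQL side of this role: a short sketch using the BigQuery Python client. Application-default credentials are assumed and the table is hypothetical:

```python
# Sketch: running SQL over a large table with the BigQuery Python client.
# Assumes application-default credentials; the table is hypothetical.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT user_id, COUNT(*) AS events
    FROM `my-project.analytics.events`
    GROUP BY user_id
    ORDER BY events DESC
    LIMIT 10
"""
for row in client.query(query).result():  # result() waits for the query job
    print(row.user_id, row.events)
```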