techniques • Demonstrable knowledge of applying Data Engineering best practices (coding practices for data science, unit testing, version control, code review). • Big Data ecosystems: Cloudera/Hortonworks, AWS EMR, GCP Dataproc or GCP Cloud Data Fusion. • NoSQL databases: DynamoDB/Neo4j/Elastic, Google Cloud Datastore. • BigQuery and Data …
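For illustration, a minimal sketch of the unit-testing practice named above, assuming a hypothetical clean_orders() transformation; pytest and pandas are the only dependencies:

```python
import pandas as pd


def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical transformation: normalise column names, drop rows missing an order ID."""
    df = df.rename(columns=str.lower)
    return df.dropna(subset=["order_id"]).reset_index(drop=True)


def test_clean_orders_drops_missing_ids():
    # Arrange: one row has no order ID and should be dropped.
    raw = pd.DataFrame({"Order_ID": [1, None, 3], "Amount": [10.0, 5.0, 7.5]})
    # Act
    cleaned = clean_orders(raw)
    # Assert: the bad row is gone and column names are normalised.
    assert list(cleaned["order_id"]) == [1, 3]
    assert "amount" in cleaned.columns
```

Run with `pytest` as part of code review/CI, which is the workflow the listing alludes to.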
Experience building data lakes and data pipelines in the cloud using Azure and Databricks or similar tools. Spark Developer certification from any of Databricks, MapR, Cloudera or Hortonworks is an added advantage but not required. Practical experience with Unix command-line tools. Familiarity with agile methodology. Strong database and data analysis skills …
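As a sketch of the kind of lake pipeline this listing describes (the paths and schema are illustrative assumptions, not taken from the listing):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-to-lake").getOrCreate()

# Read raw CSVs landed in cloud storage (hypothetical mount path).
raw = spark.read.option("header", True).csv("/mnt/landing/orders/")

# Light cleansing: type the amount column and derive a partition date.
cleaned = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
)

# Write to the lake as Parquet, partitioned by date for partition pruning.
cleaned.write.mode("overwrite").partitionBy("order_date").parquet("/mnt/lake/orders/")
```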
visualization – Tools like Tableau • Master data management (MDM) – Concepts and expertise in tools like Informatica & Talend MDM • Big data – Hadoop ecosystem, distributions like Cloudera/Hortonworks, Pig and Hive • Data processing frameworks – Spark & Spark Streaming • Hands-on experience with multiple databases like PostgreSQL, Snowflake, Oracle, MS SQL Server, NoSQL …
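A minimal Spark Structured Streaming sketch for the "Spark & Spark Streaming" item; it uses the built-in rate source so it runs without external infrastructure (in practice the source would be Kafka or files):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("streaming-demo").getOrCreate()

# The rate source emits (timestamp, value) rows at a fixed rate.
stream = spark.readStream.format("rate").option("rowsPerSecond", 5).load()

# Windowed count: a typical streaming aggregation pattern.
counts = stream.groupBy(F.window("timestamp", "10 seconds")).count()

# Aggregations require the "complete" (or "update") output mode.
query = counts.writeStream.outputMode("complete").format("console").start()
query.awaitTermination()
```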
Greater London, England, United Kingdom Hybrid / WFH Options
First Derivative
development and the opportunity to design your own path. We support a variety of external training courses and accreditations, such as AWS, GCP, Azure and Cloudera to name a few, and are truly passionate about our Mentor Program, through which our senior colleagues generously set aside personal time to coach and …
Greater London, England, United Kingdom Hybrid / WFH Options
InterEx Group
experience in Big Data implementation projects. Experience defining Big Data architectures with different tools and environments: cloud (AWS, Azure and GCP), Cloudera, NoSQL databases (Cassandra, MongoDB), ELK, Kafka, Snowflake, etc. Past experience with Data Engineering and data quality tools (Informatica, Talend, etc.). Previous involvement in …
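A minimal Kafka consumer sketch for the Kafka item above, using the confluent-kafka Python client; the broker address, topic and group id are placeholder assumptions:

```python
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker
    "group.id": "demo-consumers",           # placeholder group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["events"])              # placeholder topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)    # wait up to 1s for a record
        if msg is None:
            continue
        if msg.error():
            print(f"consumer error: {msg.error()}")
            continue
        print(msg.key(), msg.value())
finally:
    consumer.close()
```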
implement and manage data lake/data warehouse platforms (some of the following types of providers: AWS, Microsoft Azure, Google Cloud Platform, Databricks, Snowflake, Cloudera, Spark, MongoDB). Done this at companies handling high volumes of data, ideally in retailing; other sectors that use high-volume data would also be relevant …
attention to detail, and problem-solving ability to produce high-quality data solutions and products. Experience analysing very large datasets, ideally using the Cloudera Data Platform. Experience in asset management with strong knowledge of asset management/accounting-type data. Well-versed in data modelling, error handling, version control …
hierarchical approach. Your clients will include high-profile brands across a range of sectors in the technology space, ranging from large technology companies like Dynatrace, Cloudera and HCL, to exciting high-growth businesses like Harness.io, Spryker and WalkMe, to not-for-profits such as the Chartered Institute of Information Security Professionals …
make corrective recommendations. Monitoring – Be able to monitor Spark jobs using wider tools such as Grafana to see whether there are cluster-level failures. Cloudera (CDP) – Understanding of how Cloudera Spark is set up and how the runtime libraries are used by PySpark code. Prophecy – High-level understanding …
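One common way the Grafana monitoring above is wired up is by exposing Spark's metrics registry as a Prometheus endpoint; the conf keys below exist in Spark 3.x, but whether this is the right approach on a given CDP cluster is an assumption to confirm with the platform team:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("monitored-job")
    # Expose driver/executor metrics on the Spark UI in Prometheus format.
    .config("spark.ui.prometheus.enabled", "true")
    # Route the internal metrics registry to the Prometheus servlet sink.
    .config("spark.metrics.conf.*.sink.prometheusServlet.class",
            "org.apache.spark.metrics.sink.PrometheusServlet")
    .config("spark.metrics.conf.*.sink.prometheusServlet.path",
            "/metrics/prometheus")
    .getOrCreate()
)
# Prometheus scrapes e.g. http://<driver-host>:4040/metrics/prometheus and
# Grafana charts the series to surface cluster-level failures.
```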