s preferred). Excellent problem-solving and communication. Can be advantageous if you have: Cloud platform experience (AWS, Azure, GCP), big data tech (Hadoop, Spark), containerization (Docker, Kubernetes), DevOps and CI/CD understanding. We regret to inform you that only shortlisted candidates will be notified/contacted. more »
Greater Bristol Area, United Kingdom Hybrid / WFH Options
Anson McCade
and product development, encompassing experience in both stream and batch processing. Designing and deploying production data pipelines, utilizing languages such as Java, Python, Scala, Spark, and SQL. In addition, you should have proficiency or familiarity with: Scripting and data extraction via APIs, along with composing SQL queries. Integrating data more »
data warehouse, data lake design/building, and data movement. Design and deploy production data pipelines in Big data architecture using Java, Python, Scala, Spark, and SQL. Tasks involve scripting, API data extraction, and writing SQL queries. Comfortable designing and building for AWS cloud, encompassing Platform-as-a-Service more »
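The pipeline skills this listing names — scripting, API data extraction, and SQL — can be sketched in miniature. The example below is illustrative only: the payload and table name are invented, and a string stands in for the body of an API response that would normally be fetched over HTTP.

```python
import json
import sqlite3

# Hypothetical payload, standing in for the body of an API response
# (in practice you would fetch this with requests or urllib).
api_response = '[{"id": 1, "amount": 40.0}, {"id": 2, "amount": 60.0}]'

# Scripted extraction: parse the response into Python records.
records = json.loads(api_response)

# Load into a database and query it with plain SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (id, amount) VALUES (:id, :amount)", records
)

total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)  # 100.0
```

The same extract-then-query shape scales up directly: swap SQLite for a warehouse and the string for a paginated API client.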
London, England, United Kingdom Hybrid / WFH Options
Ripple Labs Inc
powering machine-learning models. Have a strong background in developing distributed systems with experience in scalable data pipelines Familiar with big data technologies like Spark or Flink and comfortable in engineering data pipelines using big data technologies on financial datasets Experience with RESTful APIs and server-side APIs integration more »
Jira, Confluence Excellent communication skills ETL design, development using SQL Server Integration Services (SSIS, SSRS) and Talend Nice to have ANSI SQL and Spark SQL Nice to have workflow automation and orchestration using Airflow Nice to have: cloud services from Amazon AWS, Google GCP and Microsoft Azure Nice more »
Cheltenham, Gloucestershire, United Kingdom Hybrid / WFH Options
Third Nexus Group Limited
and product development, encompassing experience in both stream and batch processing. · Designing and deploying production data pipelines, utilizing languages such as Java, Python, Scala, Spark, and SQL. In addition, you should have proficiency or familiarity with: · Scripting and data extraction via APIs, along with composing SQL queries. · Integrating data more »
field (STEM) Technical proficiency in cloud-based data solutions (AWS, Azure or GCP), engineering languages including Python, SQL, Java, and pipeline management tools e.g., Apache Airflow. Familiarity with big data technologies, Hadoop, or Spark. If this opportunity is of interest, or you know anyone who would be interested in more »
develop, and maintain high volume Java or Scala based data processing jobs using industry standard tools and frameworks in the Hadoop ecosystem, such as Spark, Kafka, Hive, Impala, Avro, Flume, Oozie, and Sqoop Design and maintain schemas in our analytics database. Excellence in writing efficient SQL for loading and … technologies, languages, and techniques in the rapidly evolving world of high-volume data processing. Technologies We Use: Development languages/frameworks: Java/Scala, Apache Spark, Kafka, Vertica, JavaScript (React/Redux), MicroStrategy Amazon: EMR, Step Functions, SQS, Lambda and AWS cloud-native architectures DevOps Tools: Terraform or … CloudFormation, NewRelic, Jenkins, Grafana, PagerDuty, GitHub, GitHub Actions Database: MySQL, Vertica, DynamoDB Stream Processing: Kafka, Spark Streaming, Kinesis What We Look For: Ability to work within a dynamic team committed to excellence. Leadership as a team contributor in active discussions and meetings. Solid listening skills, ability to be flexible and more »
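Two of the skills this role names — schema design in an analytics database and efficient SQL for loading — can be sketched with the standard library. The schema, table, and column names below are invented for illustration; the batching point (one `executemany` call rather than row-by-row inserts) is the part that generalises to any high-volume load.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical analytics schema: a narrow fact table keyed by date and page.
conn.execute("""
    CREATE TABLE page_views (
        view_date TEXT NOT NULL,
        page      TEXT NOT NULL,
        views     INTEGER NOT NULL
    )
""")

rows = [
    ("2024-01-01", "/home", 120),
    ("2024-01-01", "/pricing", 30),
    ("2024-01-02", "/home", 95),
]

# executemany batches the inserts in one call -- far cheaper than
# issuing a separate execute() per row when loading large volumes.
conn.executemany("INSERT INTO page_views VALUES (?, ?, ?)", rows)

daily = conn.execute(
    "SELECT view_date, SUM(views) FROM page_views"
    " GROUP BY view_date ORDER BY view_date"
).fetchall()
print(daily)  # [('2024-01-01', 150), ('2024-01-02', 95)]
```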
run on AWS and soon Azure, with plans to also add GCP and on-prem. They are adding extensive usage of distributed compute on Spark, starting with their more complex ETL and advanced analytics functions, e.g. Time Series Processing. They soon plan to integrate other approaches, including native distributed … PyTorch/TensorFlow, Spark-based distributed libraries, or Horovod. TECH STACK: Python, Flask, Redis, Postgres, React, Plotly, Docker. Temporal; AWS Athena SQL, Athena & EMR Spark, ECS Fargate; Azure Synapse/Data Lake Analytics, HDInsight. KEY RESPONSIBILITIES Lead the productionisation of Monolith’s ML models and data processing pipelines … both mid- and low-level system design and exemplary hands-on implementations using Spark and other tech stacks Shape the ML engineering culture and practices around model & data versioning, scalability, model benchmarking, ML-specific branching & release strategy Concisely break down complex high-level ML requirements into smaller deliverables (epic more »
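The time-series processing this listing mentions often reduces to windowed aggregations. A minimal sketch in plain Python (the readings and window size are invented; on Spark the same rolling mean would be a window function over a distributed DataFrame):

```python
from collections import deque

def rolling_mean(values, window):
    """Rolling mean over a fixed-size trailing window.

    deque(maxlen=window) automatically evicts the oldest value,
    so each step averages at most `window` recent readings.
    """
    buf = deque(maxlen=window)
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / len(buf))
    return out

# Hypothetical sensor readings.
readings = [10.0, 12.0, 11.0, 15.0, 14.0]
print([round(x, 2) for x in rolling_mean(readings, window=3)])
# [10.0, 11.0, 11.0, 12.67, 13.33]
```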
Terraform/Docker/Kubernetes. Write software using either Java/Scala/Python. The following are nice to have, but not required - Apache Spark jobs and pipelines. Experience with any functional programming language. Database design concepts. Writing and analysing SQL queries. Application over VIOOH Our recruitment team more »
Manchester, England, United Kingdom Hybrid / WFH Options
Made Tech
and able to guide how one could deploy infrastructure into different environments. Knowledge of handling and transforming various data types (JSON, CSV, etc.) with Apache Spark, Databricks or Hadoop Good understanding of possible architectures involved in modern data system design (Data Warehouse, Data Lakes, Data Meshes) Ability to more »
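The JSON/CSV transformation skill described here can be sketched with the standard library. The records and field names below are invented; Spark or Databricks would run the same parse-select-write transformation partitioned across a cluster rather than in one process.

```python
import csv
import io
import json

# Hypothetical JSON-lines input, as might land in a raw storage zone.
raw = "\n".join([
    '{"name": "ada", "score": 9}',
    '{"name": "grace", "score": 7}',
])

# Transform: parse each JSON record, then re-serialise as CSV.
records = [json.loads(line) for line in raw.splitlines()]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "score"])
writer.writeheader()
writer.writerows(records)
print(out.getvalue())
```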
Bristol, England, United Kingdom Hybrid / WFH Options
Made Tech
and able to guide how one could deploy infrastructure into different environments. Knowledge of handling and transforming various data types (JSON, CSV, etc.) with Apache Spark, Databricks or Hadoop Good understanding of possible architectures involved in modern data system design (Data Warehouse, Data Lakes, Data Meshes) Ability to more »
requires candidates to go through SC Clearance, so you must be eligible. Experience of AWS tools (e.g. Athena, Redshift, Glue, EMR) Java, Scala, Python, Spark, SQL Experience of developing enterprise-grade ETL/ELT data pipelines. NoSQL databases: DynamoDB/Neo4j/Elastic, Google Cloud Datastore. Snowflake Data more »
emphasis on PySpark and Databricks for this particular role. Technical Skills Required: Azure (ADF, Functions, Blob Storage, Data Lake Storage, Azure Databricks), Databricks, Spark, Delta Lake, SQL, Python, PySpark, ADLS Day To Day Responsibilities: Extensive experience in designing, developing, and managing end-to-end data pipelines, ETL (Extract more »
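The end-to-end pipeline work this role describes follows an extract-transform-load shape, sketched below in plain Python. The source rows and field names are invented; in the PySpark/Databricks setting each stage would be a DataFrame operation and `load` would write to a Delta table, but the staged structure is the same.

```python
def extract():
    # Hypothetical source rows (in practice: files read from ADLS or Blob Storage).
    return [
        {"user": "a", "spend": "10.5"},
        {"user": "b", "spend": "bad"},   # malformed row, to be filtered out
        {"user": "a", "spend": "4.5"},
    ]

def transform(rows):
    # Cast types and drop rows that fail conversion.
    clean = []
    for row in rows:
        try:
            clean.append({"user": row["user"], "spend": float(row["spend"])})
        except ValueError:
            continue
    return clean

def load(rows):
    # Aggregate spend per user -- a stand-in for writing a summary table.
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0.0) + row["spend"]
    return totals

print(load(transform(extract())))  # {'a': 15.0}
```

Keeping the stages as separate functions (or separate Spark jobs) makes each one independently testable, which is most of what "managing end-to-end pipelines" amounts to day to day.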
or more of the following tools: Informatica PowerCenter, SAS Data Integration Studio, Microsoft SSIS, Ab Initio, etc. • Ideally, you have experience in the Hadoop ecosystem (Spark, Kafka, HDFS, Hive, HBase, …), Docker and orchestration platforms (Kubernetes, OpenShift, AKS, GKE...), and NoSQL databases (MongoDB, Cassandra, Neo4j) • Any experience with cloud platforms such more »
and Public Services, Healthcare, Life Sciences, and Transport. Essential Skills & Experience: Design and deploy data pipelines in big data architecture using Java, Python, Scala, Spark, and SQL. Execute tasks involving scripting, API data extraction, and SQL queries. Proficient in data cleaning, wrangling, visualization, and reporting. Specialised in AWS cloud more »
NumPy, scikit-learn). Understanding of database technologies (ETL) and SQL proficiency for data manipulation, data mining and querying. Knowledge of Big Data Tools (Spark or Hadoop a plus). Power BI, Dashboard design/development. Regulatory Awareness/Compliance Uphold Regulatory/Compliance requirements relevant to your role more »
Google Cloud Professional Cloud Architect or Professional Cloud Developer certification Very desirable to have hands-on experience with ETL tools, Hadoop-based technologies (e.g., Spark), and batch/streaming data pipelines (e.g., Beam, Flink) Proven expertise in designing and constructing data lakes and data warehouse solutions utilising technologies more »
DynamoDB, Aurora) Knowledge and experience with Snowflake and other databases (PostgreSQL, MS SQL Server, MySQL) Experience with Big Data Batch and Streaming technologies like Spark, Kafka, Flink, Beam, Kinesis SnowPro Certification or equivalent from AWS Comfort working within an agile development cycle and exposure to: Linux development Git and more »
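The streaming technologies this listing names (Spark Streaming, Kafka, Flink, Kinesis) all build on windowed aggregation over keyed events. A minimal sketch in plain Python, with invented events; real engines add watermarking, state stores, and distribution, but the tumbling-window bucketing is the same:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Count events per key in fixed, non-overlapping time windows --
    the basic operation behind stream-processing window aggregations."""
    counts = defaultdict(int)
    for timestamp, key in events:
        # Bucket each event into the window containing its timestamp.
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)

# Hypothetical (timestamp, key) events, as might arrive from Kafka or Kinesis.
events = [(0, "click"), (3, "click"), (7, "view"), (12, "click")]
print(tumbling_window_counts(events, window_seconds=5))
# {(0, 'click'): 2, (5, 'view'): 1, (10, 'click'): 1}
```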
/data stores (object storage, document or key-value stores, graph databases, column-family databases) • Experience with big data technologies such as: Hadoop, Hive, Spark, EMR, Snowflake, and Data Mesh principles • Team player • Proactive and resilient • A passion for social good Our Mission Statement: We are an equal opportunity more »
Southampton, Hampshire, South East, United Kingdom Hybrid / WFH Options
Leo Recruitment Limited
in programming languages and tools for data analysis, such as Python, R, and SQL You must be proficient in big data technologies, such as Spark, Kafka and/or Hadoop. A strong understanding of statistical analysis, predictive modelling, machine learning algorithms, and data development and optimisation is essential You more »
Staines-Upon-Thames, England, United Kingdom Hybrid / WFH Options
IFS
with data ingestion tools such as Airbyte and Fivetran, accommodating a wide array of data sources. Mastery of large-scale data processing techniques using Spark or Dask. Strong programming skills in Python, Scala, C#, or Java, and adeptness with cloud SDKs and APIs. Deep understanding of AI/ML more »
to ensure efficient and accurate data delivery. Optimize data workflows for performance, scalability, and cost-effectiveness. Technical Expertise: Demonstrate in-depth expertise in Databricks, Apache Spark, and related big data technologies. Stay informed about the latest industry trends and advancements in data engineering. Quality Assurance: Conduct thorough testing … projects. Qualifications: Bachelor's degree in Computer Science, Engineering, or a related field. Proven experience in data engineering with a focus on Databricks and Apache Spark. Strong programming skills, preferably in Python or Scala. Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and associated data services. Excellent communication skills more »
and AI models. Data Engineer Required Experience Data engineering experience (2+ years) Cloud platform proficiency (e.g., AWS, Azure, GCP) Data pipeline development (e.g., Airflow, Apache Spark) SQL proficiency, database design Visualization tools knowledge (e.g., Tableau, PowerBI, Looker) Data Engineer Application Process This is a 1 year contract requirement more »