diverse sources, transform it into usable formats, and load it into data warehouses, data lakes, or lakehouses. Big Data Technologies: Utilize big data technologies such as Spark, Kafka, and Flink for distributed data processing and analytics. Cloud Platforms: Deploy and manage data solutions on cloud platforms such as AWS, Azure, or Google Cloud Platform (GCP), leveraging cloud-native services.
Advanced skills in Python or another major language; writing clean, testable, production-grade ETL code at scale. Modern Data Pipelines: Experience with batch and streaming frameworks (e.g., Apache Spark, Flink, Kafka Streams, Beam), including orchestration via Airflow, Prefect, or Dagster. Data Modeling & Schema Management: Demonstrated expertise in designing, evolving, and documenting schemas (OLAP/OLTP, dimensional, star/snowflake).
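To give a concrete flavour of the "production-grade ETL with orchestration" these listings ask for, here is a minimal Airflow DAG sketch in Python. Everything in it (the DAG id, schedule, and the extract/transform/load stubs) is an illustrative assumption, not taken from any posting above:

```python
# Minimal, hypothetical Airflow DAG: extract -> transform -> load.
# Real pipelines would call out to Spark jobs, dbt models, warehouse
# loaders, etc.; the stubs below only show the orchestration shape.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Stand-in for pulling rows from a source system.
    return [{"id": 1, "value": 10}, {"id": 2, "value": 20}]


def transform(ti, **context):
    rows = ti.xcom_pull(task_ids="extract")
    return [{**r, "value": r["value"] * 2} for r in rows]


def load(ti, **context):
    rows = ti.xcom_pull(task_ids="transform")
    print(f"loading {len(rows)} rows into the warehouse")


with DAG(
    dag_id="example_etl",            # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",               # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_t = PythonOperator(task_id="extract", python_callable=extract)
    transform_t = PythonOperator(task_id="transform", python_callable=transform)
    load_t = PythonOperator(task_id="load", python_callable=load)

    extract_t >> transform_t >> load_t
```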
projects. • Experience with distributed systems and microservices architectures. • Knowledge of security best practices and compliance requirements. • Experience with real-time data processing and streaming platforms (e.g., Apache Kafka, Apache Flink). • Familiarity with chaos engineering principles and tools.
are recognised in industry reports like Gartner's Magic Quadrant, the Forrester Wave, and the Frost Radar. Our tech stack: Superset and similar data visualisation tools. ETL tools: Airflow, dbt, Airbyte, Flink, etc. Data warehousing and storage solutions: ClickHouse, Trino, S3. AWS Cloud, Kubernetes, Helm. Relevant programming languages for data engineering tasks: SQL, Python, Java, etc. What you will be doing …
with demonstrated ability to solve complex distributed systems problems independently. Experience building infrastructure for large-scale data processing pipelines (both batch and streaming) using tools like Spark, Kafka, Apache Flink, Apache Beam, and proprietary solutions like Nebius. Experience designing and implementing large-scale data storage systems (feature stores, time-series DBs) for ML use cases, with strong familiarity with …
in a high-ownership, fast-paced environment. Nice to have: Experience working in the Payments, Fintech, or Financial Crime domain (e.g., fraud detection, AML, KYC). Experience with Apache Flink or other streaming data frameworks is highly desirable. Experience working in teams, building and maintaining data science and AI solutions. Experience integrating with third-party APIs, especially in regulated …
Hands-on experience with SQL, Data Pipelines, Data Orchestration and Integration Tools. Experience in data platforms on premises/cloud using technologies such as Hadoop, Kafka, Apache Spark, Apache Flink, and object, relational, and NoSQL data stores. Hands-on experience with big data application development and cloud data warehousing (e.g., Hadoop, Spark, Redshift, Snowflake, GCP BigQuery). Expertise in building data …
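As an entirely illustrative sketch of the batch-processing side of such stacks, a minimal PySpark job that reads, aggregates, and writes warehouse-bound data could look like this (the paths and column names are assumptions):

```python
# Minimal, illustrative PySpark batch job: read, aggregate, write.
# Input/output paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily_rollup").getOrCreate()

events = spark.read.parquet("s3://example-bucket/events/")  # hypothetical path

daily = (
    events
    .groupBy(F.to_date("event_ts").alias("day"), "event_type")
    .agg(F.count("*").alias("n_events"))
)

daily.write.mode("overwrite").parquet("s3://example-bucket/rollups/daily/")
spark.stop()
```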
or in a similar role. Technical expertise with data models. Great numerical and analytical skills. Experience with event-driven and streaming data architectures (using technologies such as Apache Spark, Flink, or similar). Degree in Computer Science, IT, or similar field; a Master's is a plus, or four years' equivalent experience. Taptap Values: Impact first. Team next. Accept reality. …
with big data technologies (e.g., Spark, Hadoop). Background in time-series analysis and forecasting. Experience with data governance and security best practices. Real-time data streaming is a plus (Kafka, Beam, Flink). Experience with Kubernetes is a plus. Energy/maritime domain knowledge is a plus. What We Offer: Competitive salary commensurate with experience and a comprehensive benefits package (medical, dental, vision). Significant …
with a focus on data quality and reliability. Design and manage data storage solutions, including databases, warehouses, and lakes. Leverage cloud-native services and distributed processing tools (e.g., Apache Flink, AWS Batch) to support large-scale data workloads. Operations & Tooling: Monitor, troubleshoot, and optimize data pipelines to ensure performance and cost efficiency. Implement data governance, access controls, and security … pipelines and data architectures. Hands-on expertise with cloud platforms (e.g., AWS) and cloud-native data services. Comfortable with big data tools and distributed processing frameworks such as Apache Flink or AWS Batch. Strong understanding of data governance, security, and best practices for data quality. Effective communicator with the ability to work across technical and non-technical teams. Additional … following prior to applying to GSR? Experience level applicable to this role? How many years have you designed, built, and operated stateful, exactly-once streaming pipelines in Apache Flink (or an equivalent framework such as Spark Structured Streaming or Kafka Streams)? Which statement best describes your hands-on responsibility for architecting and tuning cloud-native data lake …
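Since several of these roles probe specifically for stateful, exactly-once streaming experience, a minimal PyFlink sketch of what enabling that guarantee looks like may be useful. The source data, keying, and intervals are hypothetical:

```python
# Minimal, hypothetical PyFlink job with exactly-once checkpointing.
# The source data and keying logic are illustrative only.
from pyflink.datastream import StreamExecutionEnvironment, CheckpointingMode

env = StreamExecutionEnvironment.get_execution_environment()

# Checkpoint every 30s with exactly-once semantics; state backends,
# sinks, and restart strategies would be configured per deployment.
env.enable_checkpointing(30_000, CheckpointingMode.EXACTLY_ONCE)

# Toy in-memory source standing in for a Kafka connector.
events = env.from_collection([("user_a", 1), ("user_b", 2), ("user_a", 3)])

(events
    .key_by(lambda e: e[0])              # partition state per key
    .map(lambda e: (e[0], e[1] * 2))     # stateless transform, kept trivial
    .print())

env.execute("exactly_once_sketch")
```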
join a fast-growing team that plays an integral part of the revenue-producing arm of the company, then our team is for you. Technologies include Scala, Python, Apache Flink, Spark, Databricks, and AWS (ECS, Lambda, DynamoDB, WAF, among others). Experience in these areas is preferred but not required. Qualifications: You collaborate with team members and project management …
Go, Julia, etc.) • Experience with Amazon Web Services (S3, EKS, ECR, EMR, etc.) • Experience with containers and orchestration (e.g., Docker, Kubernetes) • Experience with Big Data processing technologies (Spark, Hadoop, Flink, etc.) • Experience with interactive notebooks (e.g., JupyterHub, Databricks) • Experience with GitOps-style automation • Experience with *nix (e.g., Linux, BSD, etc.) tooling and scripting • Participated in projects that are …
leading data and ML platform infrastructure, balancing maintenance with exciting greenfield projects. Develop and maintain our real-time model serving infrastructure, utilising technologies such as Kafka, Python, Docker, Apache Flink, Airflow, and Databricks. Actively assist in model development and debugging using tools like PyTorch, Scikit-learn, MLFlow, and Pandas, working with models from gradient boosting classifiers to custom GPT …
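To illustrate the shape of such real-time model serving, here is a purely hypothetical Python sketch of a consumer scoring events off Kafka with the kafka-python client; the topic, broker address, and model object are all assumptions:

```python
# Hypothetical sketch of real-time model scoring over Kafka.
# Topic, broker, message schema, and model are all made up.
import json

from kafka import KafkaConsumer


def load_model():
    # Stand-in for e.g. mlflow.pyfunc.load_model(...) or joblib.load(...)
    class ConstantModel:
        def predict(self, features):
            return [0.5 for _ in features]
    return ConstantModel()


model = load_model()

consumer = KafkaConsumer(
    "events",                               # hypothetical topic
    bootstrap_servers="localhost:9092",     # hypothetical broker
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    group_id="model-serving",
)

for message in consumer:
    features = [message.value.get("feature_vector", [])]
    score = model.predict(features)[0]
    print(f"scored event {message.value.get('id')}: {score:.3f}")
```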
delivering under tight deadlines without compromising quality. Your Qualifications: 12+ years of software engineering experience, ideally in platform, infrastructure, or data-centric product development. Expertise in Apache Kafka, Apache Flink, and/or Apache Pulsar. Deep understanding of event-driven architectures, data lakes, and streaming pipelines. Strong experience integrating AI/ML models into production systems, including prompt engineering …
stack technologies including Java, JavaScript, TypeScript, React, APIs, MongoDB, Elasticsearch, DMN, BPMN, and Kubernetes; leverage data-streaming technologies including Kafka CDC, Kafka topics and related technologies, EMS, and Apache Flink; be able to innovate and incubate new ideas; have an opportunity to work on a broad range of problems, often dealing with large data sets, including real-time processing …
and constructive feedback to foster accountability, growth, and collaboration within the team. Who You Are: Experienced with Data Processing Frameworks: Skilled with higher-level JVM-based frameworks such as Flink, Beam, Dataflow, or Spark. Comfortable with Ambiguity: Able to work through loosely defined problems and thrive in autonomous team environments. Skilled in Cloud-based Environments: Proficient with large-scale …
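For a sense of what these higher-level dataflow frameworks express, here is a minimal Apache Beam pipeline. Beam's Python SDK is used to keep all sketches in this section in one language (the listings reference the JVM SDKs), and the element values are made up:

```python
# Minimal, illustrative Apache Beam pipeline: word count over a
# tiny in-memory collection standing in for a real source.
import apache_beam as beam

with beam.Pipeline() as pipeline:
    (pipeline
        | "Create" >> beam.Create(["alpha", "beta", "alpha", "gamma"])
        | "PairWithOne" >> beam.Map(lambda word: (word, 1))
        | "CountPerKey" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print))
```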
data or backend engineering, while growing the ability to work effectively across both. Experience with processing large-scale transactional and financial data, using batch/streaming frameworks like Spark, Flink, or Beam (with Scala for data engineering), and building scalable backend systems in Java. You possess a foundational understanding of system design, data structures, and algorithms, coupled with a …
Java, data structures and concurrency, rather than relying on frameworks such as Spring. You have built event-driven applications using Kafka and solutions with event-streaming frameworks at scale (Flink/Kafka Streams/Spark) that go beyond basic ETL pipelines. You know how to orchestrate the deployment of applications on Kubernetes, including defining services, deployments, stateful sets, etc. …
Familiarity with geospatial data formats (e.g., GeoJSON, Shapefiles, KML) and tools (e.g., PostGIS, GDAL, GeoServer). Technical Skills: Expertise in big data frameworks and technologies (e.g., Hadoop, Spark, Kafka, Flink) for processing large datasets. Proficiency in programming languages such as Python, Java, or Scala, with a focus on big data frameworks and APIs. Experience with cloud services and technologies … related field. Experience with data visualization tools and libraries (e.g., Tableau, D3.js, Mapbox, Leaflet) for displaying geospatial insights and analytics. Familiarity with real-time stream processing frameworks (e.g., Apache Flink, Kafka Streams). Experience with geospatial data processing libraries (e.g., GDAL, Shapely, Fiona). Background in defense, national security, or environmental monitoring applications is a plus. Compensation and Benefits …
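For the geospatial side of such roles, a small illustrative example of the Shapely/GeoJSON workflow the listing names; the geometry and coordinates below are made up:

```python
# Illustrative point-in-polygon check using Shapely on GeoJSON-style
# geometries; the area of interest and sensor coordinates are made up.
from shapely.geometry import shape

area_of_interest = shape({
    "type": "Polygon",
    "coordinates": [[
        [-0.2, 51.4], [0.1, 51.4], [0.1, 51.6], [-0.2, 51.6], [-0.2, 51.4]
    ]],
})

sensor_ping = shape({"type": "Point", "coordinates": [-0.1, 51.5]})

print(area_of_interest.contains(sensor_ping))  # True: ping falls inside the AOI
```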
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
WorksHub
us achieve our objectives. So each team leverages the technology that fits their needs best. You'll see us working with data processing/streaming technologies like Kinesis, Spark, and Flink; application technologies like PostgreSQL, Redis, and DynamoDB; and breaking things using in-house chaos principles and tools such as Gatling to drive load, all deployed and hosted on AWS. Our …
plus. Experience with Terraform and Kubernetes is a plus! A genuine excitement for significantly scaling large data systems. Technologies we use (experience not required): AWS serverless architectures, Kubernetes, Spark, Flink, Databricks, Parquet, Iceberg, Delta Lake, Paimon, Terraform, GitHub (including GitHub Actions), Java, PostgreSQL. About Chainalysis: Blockchain technology is powering a growing wave of innovation. Businesses and governments around the …
Craft: Data, Analytics & Strategy. Job Description: Activision Blizzard Media is the gateway for brands to a leading cross-platform gaming company in the Western world, with hundreds of millions of players across over 190 countries. Our legendary portfolio includes iconic mobile …
to cross-functional teams, ensuring best practices in data architecture, security, and cloud computing. Proficiency in data modelling, ETL processes, data warehousing, distributed systems, and metadata systems. Utilise Apache Flink and other streaming technologies to build real-time data processing systems that handle large-scale, high-throughput data. Ensure all data solutions comply with industry standards and government regulations … not limited to EC2, S3, RDS, Lambda, and Redshift. Experience with other cloud providers (e.g., Azure, GCP) is a plus. In-depth knowledge and hands-on experience with Apache Flink for real-time data processing. Proven experience in mentoring and managing teams, with a focus on developing talent and fostering a collaborative work environment. Strong ability to engage with …