Data Engineer
Title: Data Engineer
Level of experience: L5
Location: London, UK
As a Data Engineer, you will design, build, and operate scalable data pipelines and data models that support customer-facing features and internal analytics. You’ll solve complex data warehousing and big-data processing challenges using AWS technologies, delivering self-service analytics, infrastructure-as-code, and high-performance ETL/ELT workflows. You will also develop automated data quality frameworks that validate accuracy, detect anomalies, and increase trust in downstream data products. In this role, you will partner closely with business, science, and engineering teams to tackle non-standard data problems and deliver high-impact solutions that scale with rapid growth and evolving business needs.
Key job responsibilities
- Build and optimize data pipelines to ingest and transform data from various sources, including traditional ETL pipelines and event data streams.
- Consolidate data from disparate systems into meaningful datasets for analytics and reporting.
- Implement big-data technologies (e.g., Redshift, EMR, Spark, SNS, SQS, Kinesis) to optimize processing of large datasets.
- Develop and maintain the team's data platform, including infrastructure-as-code using AWS CDK.
- Work closely with business stakeholders to understand their needs and translate them into technical solutions.
- Analyze business processes, logical data models, and relational database implementations.
- Write high-performing SQL queries.
- Design and implement automated data processing solutions and data quality controls.
- Collaborate with software engineers to support the data needs of products.
- Participate in on-call rotations to support the team's products and data pipelines.
- Optimize data processing and storage solutions to improve performance and reduce costs.
Basic qualifications
- Knowledge of professional software engineering and best practices for the full software development life cycle, including coding standards, software architectures, code reviews, source control management, continuous deployments, testing, and operational excellence
- Experience working on and delivering end-to-end projects independently
- Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
- Experience with data modeling, warehousing and building ETL pipelines
- Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or Node.js
- Experience as a data engineer or related specialty (e.g., software engineer, business intelligence engineer, data scientist) with a track record of manipulating, processing, and extracting value from large datasets
- Experience with SQL
Preferred qualifications
- Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
- Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
- Experience with Apache Spark / Elastic MapReduce