Data Engineer - £350PD - Remote

Required Technical Skills

Data Pipeline & ETL

  • Design, build, and maintain robust ETL/ELT pipelines for structured and unstructured data

  • Hands-on experience with AWS Glue and AWS Step Functions

  • Implementation of data validation, data quality frameworks, and reconciliation checks

  • Strong error handling, monitoring, and retry strategies in production pipelines

  • Experience with incremental data processing patterns (CDC, watermarking, upserts)



AWS Data Services

  • Amazon S3: data lake architectures, partitioning strategies, lifecycle policies

  • DynamoDB: data modeling, secondary indexes, streams, and performance optimization

  • Amazon Redshift: foundational querying, integrations, and performance considerations

  • AWS Lambda for scalable data processing and orchestration

  • Amazon EventBridge for event-driven and decoupled data pipelines



Vector Databases & Embeddings

  • Strong understanding of vector database concepts, indexing strategies, and performance trade-offs

  • Design and implementation of embedding generation pipelines

  • Optimization techniques for semantic search and retrieval accuracy

  • Effective chunking strategies for document ingestion and processing

  • Experience with CockroachDB deployment and management is beneficial



Document Processing

  • Experience with PDF parsing libraries such as PyPDF2 and pdfplumber, and with managed extraction services such as Amazon Textract

  • Integration of OCR solutions (Amazon Textract, Tesseract) for scanned documents

  • Extraction of document structure (headings, tables, sections)

  • Metadata extraction, normalization, and enrichment

  • Handling of multiple document formats including PDF, HTML, and DOCX



Data Integration

  • Familiarity with SAP data structures is beneficial

  • Integration with PIM (Product Information Management) systems

  • Design and consumption of REST APIs



Programming & Querying

  • Python (advanced): pandas, numpy, boto3, and data processing best practices

  • SQL (advanced): complex queries, performance tuning, and query optimization



Data Quality & Governance

  • Data profiling and ongoing quality assessment

  • Schema validation and evolution strategies

  • Data lineage tracking and observability

  • Understanding of Master Data Management (MDM) concepts



Domain Knowledge

  • Product catalog data models and hierarchies

  • E-commerce data patterns and integrations

  • B2B data exchange and system integration

To apply for this role, please submit your CV or contact Dillon Blackburn on (phone number removed) or at (url removed).

Tenth Revolution Group are the go-to recruiter for Data & AI roles in the UK, offering more opportunities across the country than any other recruitment agency. We're the proud sponsor and supporter of SQLBits, Power Platform World Tour, and the London Fabric User Group. We are the global leaders in Data & AI recruitment.

Job Details

Company: Tenth Revolution Group
Location: City of London, London, United Kingdom (Hybrid / Remote Options)
Employment Type: Contract
Salary: £300 - £350/day