Sr Data Architect
JOB DETAILS
Role Title: Senior Data Architect
Contract Duration: 6 Months
Work mode: Hybrid | work from office (WFO) 2 days a week.
Location: London
Hands-On Data Architect: Full Job Description
• Data Products (To-Be):
Channel Ops Warehouse (~30-day high-performance layer) and Channel Analytics Lake (7+ years retention). Expose status and statement APIs with clear SLAs.
• Platform Architecture:
S3/Glue/Athena/Iceberg lakehouse, Redshift for BI/ops. QuickSight for PO/ops dashboards. Lambda/Step Functions for stream processing orchestration.
• Streaming & Ingest:
Kafka (K4/K5/Confluent) and AWS MSK/Kinesis; connectors/CDC to DW/Lake. Partitioning, retention, replay, idempotency. EventBridge for AWS-native event routing.
• Event Contracts:
Avro/Protobuf, Schema Registry, compatibility rules, versioning strategy.
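The backward-compatibility rule the Schema Registry enforces for Avro records can be sketched in a few lines: a new reader schema can only decode old data if every field it adds carries a default. The `PaymentStatus` schema below is illustrative, not from the source.

```python
import json

# Two versions of an illustrative Avro record schema for a payment-status event.
SCHEMA_V1 = json.loads("""{
  "type": "record", "name": "PaymentStatus",
  "fields": [
    {"name": "payment_id", "type": "string"},
    {"name": "status", "type": "string"}
  ]
}""")

SCHEMA_V2 = json.loads("""{
  "type": "record", "name": "PaymentStatus",
  "fields": [
    {"name": "payment_id", "type": "string"},
    {"name": "status", "type": "string"},
    {"name": "reason_code", "type": "string", "default": "NONE"}
  ]
}""")

def backward_compatible(old: dict, new: dict) -> bool:
    """New readers must still decode old data: every field added in `new`
    that is absent from `old` needs a default value."""
    old_names = {f["name"] for f in old["fields"]}
    return all("default" in f for f in new["fields"]
               if f["name"] not in old_names)

print(backward_compatible(SCHEMA_V1, SCHEMA_V2))  # True: the added field has a default
```

This is the essence of the Registry's BACKWARD mode; FULL compatibility applies the same check in both directions.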
• As-Is → To-Be:
Inventory the API/File/SWIFT feeds and existing stores (Aurora Postgres, Kafka); define migration waves and cutover runbooks.
• Governance & Quality:
Data-as-a-product ownership, lineage, access controls, quality rules, retention.
• Observability & FinOps:
Grafana/Prometheus/CloudWatch for TPS, success rate, lag, spend per 1M events. Runbooks + actionable alerts.
• Scale & Resilience:
Tens of millions of payments/day, multi-AZ/region patterns, pragmatic RPO/RTO.
• Security:
Data classification, KMS encryption, tokenization where needed, least-privilege IAM, immutable audit.
• Hands-on Build:
Python/Scala/SQL; Spark/Glue; Step Functions/Lambda; IaC (Terraform); CI/CD (GitLab/Jenkins); automated tests.
Must-Have Skills:
• Streaming & EDA
Kafka (Confluent) and AWS MSK/Kinesis/Kinesis Firehose; outbox, ordering, replay, exactly-once/at-least-once semantics. EventBridge for event routing and filtering.
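The transactional-outbox pattern named above can be sketched as follows; SQLite stands in for Aurora Postgres and the table/column names are assumptions, not part of the JD. The point is that the domain write and the event land in one local transaction, and a separate relay later publishes unpublished outbox rows to Kafka.

```python
import sqlite3, json, uuid

# Transactional outbox: the payment row and its event commit atomically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id TEXT PRIMARY KEY, amount INTEGER)")
conn.execute("""CREATE TABLE outbox (
    event_id TEXT PRIMARY KEY, topic TEXT, payload TEXT,
    published INTEGER DEFAULT 0)""")

def accept_payment(payment_id: str, amount: int) -> None:
    with conn:  # single transaction: both inserts commit or neither does
        conn.execute("INSERT INTO payments VALUES (?, ?)", (payment_id, amount))
        conn.execute(
            "INSERT INTO outbox (event_id, topic, payload) VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "payments.accepted",
             json.dumps({"payment_id": payment_id, "amount": amount})))

accept_payment("p-001", 2500)
# A relay process would poll these rows, publish to Kafka, then mark published=1.
pending = conn.execute(
    "SELECT topic, payload FROM outbox WHERE published = 0").fetchall()
print(pending)
```

Because the relay may publish a row more than once, this pattern pairs naturally with the idempotent-consumer requirement also listed here.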
• Schema Management:
Avro/Protobuf + Schema Registry (compatibility, subject strategy, evolution).
• AWS Data Stack:
S3/Glue/Athena, Redshift, Step Functions, Lambda; Iceberg-ready lakehouse patterns. Kinesis→S3→Glue streaming pipelines; Glue Streaming; DLQ patterns.
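The DLQ pattern mentioned above reduces to a small consumer loop; this sketch is pure Python (no boto3), with the handler logic and field names invented for illustration. A record that keeps failing is parked with its error rather than blocking the stream.

```python
import json

def process(record: dict) -> None:
    """Illustrative handler: rejects malformed records."""
    if "payment_id" not in record:
        raise ValueError("missing payment_id")

def consume(batch, max_retries=3):
    """At-least-once batch consumer with a dead-letter queue: retry each
    record a bounded number of times, then divert it to the DLQ."""
    dlq = []
    for record in batch:
        for attempt in range(1, max_retries + 1):
            try:
                process(record)
                break
            except ValueError as exc:
                if attempt == max_retries:
                    dlq.append({"record": record, "error": str(exc)})
    return dlq

batch = [{"payment_id": "p-1"}, {"amount": 10}]  # second record is malformed
dead = consume(batch)
print(json.dumps(dead))
```

In the Kinesis→S3→Glue pipelines named above, the `dlq` list would instead be a dedicated stream or S3 prefix so parked records can be inspected and replayed.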
• Payments & ISO 20022:
PAIN/PACS/CAMT, lifecycle modeling, reconciliation/advices; API/File/SWIFT channel knowledge.
• Governance:
Data-mesh mindset; ownership, quality SLAs, access, retention, lineage.
• Observability & FinOps:
Build dashboards, alerts, and cost KPIs; troubleshoot lag/throughput at scale.
• Delivery:
Production code, performance profiling, code reviews, automated tests, secure by design.
• Data Architecture Fundamentals (Must-Have):
- Logical Data Modeling
Entity-relationship diagrams, normalization (1NF through Boyce-Codd/BCNF), denormalization trade-offs; identify functional dependencies and key anomalies.
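Identifying a functional dependency, as asked for here, is mechanical: X → Y holds iff every X value maps to exactly one Y value. A minimal check over sample rows (the columns and data are illustrative):

```python
# Functional-dependency check: X -> Y holds iff each X value
# determines a single Y value across all rows.
rows = [  # denormalized rows exhibiting an update anomaly
    {"payment_id": "p1", "customer_id": "c1", "customer_name": "Acme"},
    {"payment_id": "p2", "customer_id": "c1", "customer_name": "ACME Ltd"},
]

def holds(rows, lhs, rhs):
    seen = {}
    for r in rows:
        key = tuple(r[c] for c in lhs)
        val = tuple(r[c] for c in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same determinant, different dependent: FD violated
    return True

# customer_id -> customer_name fails, flagging the anomaly that
# normalization to 3NF (a separate customer table) would remove.
print(holds(rows, ["customer_id"], ["customer_name"]))  # False
```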
- Physical Data Modeling
Table design, partitioning strategies, indexes; SCD types; dimensional vs. transactional schemas; storage patterns for OLTP vs. analytics.
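Of the SCD types listed, Type 2 is the one that needs a concrete mechanic: a changed attribute closes the current row and opens a new one, preserving history. A sketch against SQLite (the `dim_account` dimension and its columns are assumptions for illustration):

```python
import sqlite3

# SCD Type 2: attribute changes expire the current row and insert a new
# version, so the dimension supports point-in-time queries.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE dim_account (
    account_id TEXT, branch TEXT,
    valid_from TEXT, valid_to TEXT, is_current INTEGER)""")

def upsert_scd2(account_id: str, branch: str, as_of: str) -> None:
    with conn:
        cur = conn.execute(
            "SELECT branch FROM dim_account WHERE account_id=? AND is_current=1",
            (account_id,)).fetchone()
        if cur and cur[0] == branch:
            return  # unchanged: nothing to do
        if cur:  # expire the old version
            conn.execute(
                "UPDATE dim_account SET valid_to=?, is_current=0 "
                "WHERE account_id=? AND is_current=1", (as_of, account_id))
        conn.execute(
            "INSERT INTO dim_account VALUES (?, ?, ?, '9999-12-31', 1)",
            (account_id, branch, as_of))

upsert_scd2("a1", "London", "2024-01-01")
upsert_scd2("a1", "Leeds", "2024-06-01")
history = conn.execute(
    "SELECT branch, is_current FROM dim_account ORDER BY valid_from").fetchall()
print(history)  # [('London', 0), ('Leeds', 1)]
```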
- Normalization & Design
Normalize to 3NF/BCNF for OLTP; understand when to denormalize for queries; trade-offs between 3NF, Data Vault, and star schemas.
- CQRS (Command Query Responsibility Segregation)
Separate read/write models; event sourcing and state reconstruction; eventual consistency patterns; when CQRS is justified vs. overkill.
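The CQRS-plus-event-sourcing combination described here fits in a few lines: the write side appends events, and any read model is a fold over the log. The payment commands and event names below are illustrative.

```python
# Minimal event sourcing + CQRS: commands append events; the read model
# is projected from the log and can be rebuilt at any time (replay).
events = []  # append-only log (stand-in for Kafka or an event store)

def handle_command(cmd: dict) -> None:
    """Write side: validate, then append an event; never mutate state directly."""
    if cmd["type"] == "authorise":
        events.append({"type": "PaymentAuthorised",
                       "id": cmd["id"], "amount": cmd["amount"]})
    elif cmd["type"] == "settle":
        events.append({"type": "PaymentSettled", "id": cmd["id"]})

def read_model() -> dict:
    """Read side: project current status per payment by folding the log."""
    view = {}
    for e in events:
        if e["type"] == "PaymentAuthorised":
            view[e["id"]] = "AUTHORISED"
        elif e["type"] == "PaymentSettled":
            view[e["id"]] = "SETTLED"
    return view

handle_command({"type": "authorise", "id": "p1", "amount": 100})
handle_command({"type": "settle", "id": "p1"})
print(read_model())  # {'p1': 'SETTLED'}
```

The gap between an appended event and the projected view is exactly the eventual consistency the bullet refers to; when a single model would serve both sides, CQRS is the "overkill" case.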
- Event-Driven Architecture (EDA)
Event-first design; aggregate boundaries and invariants; publish/subscribe patterns; saga orchestration; idempotency and at-least-once delivery.
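Idempotency under at-least-once delivery, the last item above, usually means deduplicating on a stable event id so redelivery is a no-op. A minimal consumer-side sketch (the balance projection is invented for illustration; in production the processed-id set would live in a durable store):

```python
# Idempotent consumer: duplicates share an event_id, so a processed-id
# set makes redelivered events no-ops.
processed_ids = set()
balance = {"total": 0}

def on_event(event: dict) -> bool:
    """Apply the event exactly once; return False for a duplicate delivery."""
    if event["event_id"] in processed_ids:
        return False
    processed_ids.add(event["event_id"])
    balance["total"] += event["amount"]
    return True

on_event({"event_id": "e1", "amount": 50})
on_event({"event_id": "e1", "amount": 50})  # redelivered duplicate, ignored
print(balance["total"])  # 50, not 100
```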
- Bounded Contexts & Domain Modeling
Core/supporting/generic subdomains; context maps (anti-corruption layers, shared kernel, conformist, published language); ubiquitous language.
- Entities, Value Objects & Repositories
Domain entity identity; immutability for value objects; repository abstraction over persistence; temporal/versioned records.
- Domain Events & Contracts
Schema versioning (Avro/Protobuf); backward/forward compatibility; event replay; mapping domain events to Kafka topics and Aurora tables.
Nice-to-Have:
- QuickSight/Tableau; Redshift tuning; ksqlDB/Flink; Aurora Postgres internals.
- Edge/API constraints (Apigee/API-GW); mTLS/webhook patterns.