Kafka Data Architect (Streaming and Payments)
We are seeking a Hands-On Data Architect to design, build, and operate a high-scale, event-driven data platform supporting payment and channel operations. This role combines strong data architecture fundamentals, deep streaming expertise, and hands-on engineering in a regulated, high-throughput environment.
You will lead the evolution from legacy data ingestion patterns to a modern AWS-based lakehouse and streaming architecture, handling tens of millions of events per day, while applying domain-driven design (DDD) and data-as-a-product principles.
This is a builder role, not a documentation-only architect position.
Key Responsibilities
Data Products & Architecture
- Design and deliver core data products including:
  - Channel Operations Warehouse (high-performance, ~30-day retention)
  - Channel Analytics Lake (long-term retention, 7+ years)
- Define and expose data APIs and status/statement services with clear SLAs.
- Architect an AWS lakehouse on S3, Glue, Athena, and Iceberg, with Redshift for BI and operational analytics (see the Iceberg sketch below).
- Enable dashboards and reporting using Amazon QuickSight (or equivalent BI tools).
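To illustrate the lakehouse work above, here is a minimal PySpark sketch of registering a day-partitioned Iceberg table in the Glue Data Catalog. The catalog name, database, bucket, and columns are illustrative assumptions, not a prescribed schema.

```python
# Minimal sketch: create an Iceberg table for the Channel Analytics Lake in the
# Glue Data Catalog. Catalog/database/bucket/column names are illustrative.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("channel-analytics-lake")
    # Iceberg + Glue catalog wiring; assumes the iceberg-spark runtime and
    # Iceberg AWS bundle are on the classpath.
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue_catalog.catalog-impl",
            "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue_catalog.warehouse",
            "s3://channel-analytics-lake/warehouse/")
    .getOrCreate()
)

# Day-partitioned event table; Athena can query the same Iceberg metadata, so
# the lake stays a single source of truth for BI and operational analytics.
spark.sql("""
    CREATE TABLE IF NOT EXISTS glue_catalog.channel_analytics.payment_events (
        event_id   string,
        payment_id string,
        event_type string,
        amount     decimal(18, 2),
        currency   string,
        event_time timestamp
    )
    USING iceberg
    PARTITIONED BY (days(event_time))
""")
```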
Streaming & Event-Driven Architecture
- Design and implement real-time streaming pipelines using:
  - Kafka (Confluent or AWS MSK)
  - AWS Kinesis / Kinesis Firehose
  - EventBridge for AWS-native event routing
- Define patterns for:
  - Ordering, replay, retention, and idempotency
  - At-least-once and exactly-once processing
  - Dead-letter queues (DLQs) and failure recovery (see the consumer sketch below)
- Implement CDC pipelines from Aurora PostgreSQL into Kafka and the lakehouse (see the connector sketch below).
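To make the delivery-semantics and DLQ patterns concrete, here is a minimal at-least-once consumer sketch using confluent-kafka. The broker address, topic names, the in-memory dedup set, and process_payment_event() are illustrative placeholders; a real deployment would back idempotency with a durable store (e.g. DynamoDB or the warehouse itself).

```python
# Sketch: at-least-once Kafka consumer with idempotent handling and a
# dead-letter topic. Broker address, topics, and the handler are placeholders.
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "channel-ops-warehouse-loader",
    "enable.auto.commit": False,        # commit only after the record is handled
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["payments.events"])

seen_event_ids = set()                  # placeholder idempotency store

def process_payment_event(event: dict) -> None:
    """Placeholder for the real sink (warehouse upsert, status API, ...)."""

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            continue                    # log/alert in a real pipeline
        event = json.loads(msg.value())
        try:
            if event.get("event_id") not in seen_event_ids:  # skip replayed duplicates
                process_payment_event(event)
                seen_event_ids.add(event.get("event_id"))
        except Exception:
            # Park the poison record on the DLQ so it cannot stall the partition.
            producer.produce("payments.events.dlq", key=msg.key(), value=msg.value())
            producer.flush()
        consumer.commit(message=msg)    # at-least-once: commit after handling
finally:
    consumer.close()
```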
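The CDC responsibility can be sketched in the same spirit: registering a Debezium PostgreSQL source connector with a Kafka Connect cluster to stream Aurora PostgreSQL changes into Kafka. Hostnames, credentials, tables, and the topic prefix are illustrative, and the exact connector options depend on the Debezium version in use.

```python
# Sketch: register a Debezium PostgreSQL connector via the Kafka Connect REST
# API. All connection details below are illustrative placeholders.
import requests

connector = {
    "name": "aurora-payments-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",                      # logical decoding plugin
        "database.hostname": "aurora-cluster.example.internal",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "change-me",
        "database.dbname": "payments",
        "topic.prefix": "aurora.payments",              # Kafka topic namespace
        "table.include.list": "public.payment,public.statement",
    },
}

resp = requests.post("http://kafka-connect.example.internal:8083/connectors",
                     json=connector, timeout=30)
resp.raise_for_status()
```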
Event Contracts & Schema Management
- Define and govern event contracts using Avro or Protobuf.
- Manage schema evolution through Schema Registry, including:
  - Compatibility rules
  - Versioning strategies
  - Backward and forward compatibility (see the registry sketch below)
- Align domain events with Kafka topics and analytical storage models.
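A minimal sketch of this governance workflow, assuming Confluent Schema Registry and its Python client: register an Avro contract under a topic-value subject and pin the subject to BACKWARD compatibility. The registry URL, subject, and fields are illustrative.

```python
# Sketch: register an Avro event contract and set the subject's compatibility
# level. Registry URL, subject name, and schema fields are illustrative.
from confluent_kafka.schema_registry import Schema, SchemaRegistryClient

PAYMENT_EVENT_V1 = """
{
  "type": "record",
  "name": "PaymentEvent",
  "namespace": "com.example.payments",
  "fields": [
    {"name": "event_id",   "type": "string"},
    {"name": "payment_id", "type": "string"},
    {"name": "amount",     "type": "string"},
    {"name": "currency",   "type": "string"},
    {"name": "status",     "type": "string", "default": "RECEIVED"}
  ]
}
"""

client = SchemaRegistryClient({"url": "http://schema-registry.example.internal:8081"})
subject = "payments.events-value"

# BACKWARD compatibility: a new schema version must still read old records,
# which is why added fields carry defaults.
client.set_compatibility(subject_name=subject, level="BACKWARD")
schema_id = client.register_schema(subject, Schema(PAYMENT_EVENT_V1, schema_type="AVRO"))
print(f"registered {subject} as schema id {schema_id}")
```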
Migration & Modernization
- Assess existing "as-is" ingestion mechanisms (APIs, files, SWIFT feeds, Kafka, relational stores).
- Design and execute migration waves, cutover strategies, and rollback runbooks.
- Ensure minimal disruption during platform transitions.
Governance, Quality & Security
- Apply data-as-a-product and data mesh principles:
  - Clear ownership
  - Quality SLAs
  - Access controls
  - Retention and lineage
- Implement security best practices:
  - Data classification
  - KMS-based encryption (see the SSE-KMS sketch below)
  - Tokenization where required
  - Least-privilege IAM
  - Immutable audit logging
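As one concrete example of encryption at rest, a small boto3 sketch writing an object to the lake with SSE-KMS under a customer-managed key. The bucket, key alias, and object key are placeholders; least-privilege access is enforced separately through IAM and KMS key policies.

```python
# Sketch: SSE-KMS encrypted write to S3. Bucket, key alias, and object key are
# illustrative placeholders.
import boto3

s3 = boto3.client("s3")
s3.put_object(
    Bucket="channel-analytics-lake",
    Key="statements/2024/06/statement-batch-001.json",
    Body=b'{"statement_id": "..."}',
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/channel-data-key",   # customer-managed KMS key
)
```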
Observability, Reliability & FinOps
- Build observability for streaming and data platforms using:
  - CloudWatch, Prometheus, Grafana
- Track operational KPIs:
  - Throughput (TPS)
  - Processing lag (see the lag-metric sketch below)
  - Success/error rates
  - Cost per million events
- Define actionable alerts, dashboards, and operational runbooks.
- Design for high availability with multi-AZ / multi-region patterns, meeting defined RPO/RTO targets.
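As an example of one of these KPIs, a hedged sketch that computes consumer-group lag for a single partition and publishes it as a custom CloudWatch metric. Broker address, group, topic, and metric namespace are illustrative placeholders.

```python
# Sketch: measure consumer lag on one partition and push it to CloudWatch.
import boto3
from confluent_kafka import Consumer, TopicPartition

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "channel-ops-warehouse-loader",
    "enable.auto.commit": False,
})
cloudwatch = boto3.client("cloudwatch")

tp = TopicPartition("payments.events", 0)
committed = consumer.committed([tp])[0].offset      # group's last committed offset
_low, high = consumer.get_watermark_offsets(tp)     # current end of the partition
lag = max(high - committed, 0) if committed >= 0 else high

cloudwatch.put_metric_data(
    Namespace="ChannelDataPlatform",
    MetricData=[{
        "MetricName": "ConsumerLag",
        "Dimensions": [
            {"Name": "Topic", "Value": tp.topic},
            {"Name": "Partition", "Value": str(tp.partition)},
        ],
        "Value": float(lag),
        "Unit": "Count",
    }],
)
consumer.close()
```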
Hands-On Engineering
- Write and review production-grade code using:
  - Python, Scala, SQL
  - Spark / AWS Glue
  - AWS Lambda & Step Functions (see the Lambda sketch below)
- Build infrastructure using Terraform (IaC).
- Implement CI/CD pipelines (GitLab, Jenkins).
- Enforce automated testing, performance profiling, and secure coding practices.
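To indicate the expected level of hands-on code, a small Lambda handler sketch for a Kinesis event source with ReportBatchItemFailures enabled, so that only failed records are retried rather than the whole batch; handle_event() is a placeholder.

```python
# Sketch: Kinesis-triggered Lambda reporting partial batch failures.
import base64
import json

def handle_event(event: dict) -> None:
    """Placeholder for the real transformation / warehouse write."""

def lambda_handler(event, context):
    failures = []
    for record in event["Records"]:
        try:
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            handle_event(payload)
        except Exception:
            # Report only this record; Lambda retries from the lowest failed
            # sequence number in the shard.
            failures.append({"itemIdentifier": record["kinesis"]["sequenceNumber"]})
    return {"batchItemFailures": failures}
```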
Required Skills & Experience
Streaming & Event-Driven Systems
- Strong experience with Kafka (Confluent) and/or AWS MSK
- Experience with AWS Kinesis / Firehose
- Deep understanding of:
  - Event ordering and replay
  - Delivery semantics
  - Outbox and CDC patterns
- Practical experience using EventBridge for event routing and filtering
AWS Data Platform
- Hands-on experience with:
  - S3, Glue, Athena
  - Redshift
  - Step Functions and Lambda
- Familiarity with Iceberg-based lakehouse architectures
- Experience building streaming pipelines into S3 and Glue
Payments & Financial Messaging
- Experience with payments data and flows
- Knowledge of ISO 20022 messages:
  - PAIN, PACS, CAMT (see the pacs.008 sketch below)
- Understanding of payment lifecycle, reconciliation, and statements
- Exposure to API, file-based, and SWIFT-based integration channels
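For orientation on the ISO 20022 side, a minimal sketch that reads two fields from a pacs.008 (FI-to-FI customer credit transfer) message using the Python standard library. The namespace version and file name are illustrative.

```python
# Sketch: extract end-to-end IDs and settlement amounts from a pacs.008 message.
import xml.etree.ElementTree as ET

NS = {"doc": "urn:iso:std:iso:20022:tech:xsd:pacs.008.001.08"}  # version is illustrative

root = ET.parse("pacs008_sample.xml").getroot()
for tx in root.findall(".//doc:CdtTrfTxInf", NS):
    end_to_end_id = tx.findtext("doc:PmtId/doc:EndToEndId", namespaces=NS)
    amount = tx.find("doc:IntrBkSttlmAmt", NS)
    print(end_to_end_id, amount.text, amount.get("Ccy"))
```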
Data Architecture Fundamentals (Must-Have)
- Logical data modeling (ER diagrams, normalization up to 3NF/BCNF)
- Physical data modeling:
  - Partitioning strategies
  - Indexing
  - Slowly changing dimension (SCD) types
- Strong understanding of:
  - Transactional vs analytical schemas
  - Star schema, Data Vault, and 3NF trade-offs
- Practical experience with:
  - CQRS and event sourcing
  - Event-driven architecture
  - Domain-driven design (bounded contexts, aggregates, domain events)