Principal Engineer (Agentic AI)
About CleverChain
CleverChain is a growth-stage RegTech company with an award-winning KYB platform, recognised by Chartis Research for Best Know Your Business (KYB) and by Datos Insights for Best KYC/KYB Innovation. Our cloud-based platform automates compliance processes, streamlining customer onboarding and risk monitoring for financial institutions, fintechs, payment providers, and any organisation that needs to verify and manage its customers. We're at a pivotal moment of growth and are looking to make critical hires to help us scale.
Role Overview
We've recently built a new platform from the ground up, a multi-tenant, multi-region distributed microservices platform that serves as the foundation for all of our products. The platform was purpose-built to remove engineering as the bottleneck in product development: it provides a workflow and agent execution engine, a provider framework, and a tooling layer that allows our product teams to ship new agents, workflows, and products in days or weeks, not months.
Because of this, the core engineering focus isn't building individual product features on repeat. It's building and maintaining the engine itself, extending its capabilities, expanding its toolsets, and ensuring the platform gives product teams and customers the flexibility they need to build on top of it without waiting for engineering cycles.
We're looking for a Principal Engineer to join as one of the first hires on this platform. You'll work across the full backend, from database schema design and API implementation through to workflow orchestration, AI integrations, and infrastructure. This isn't maintenance work. It's building.
The platform has genuine depth: multi-region deployments with data residency compliance, a durable workflow execution engine, autonomous AI agents, and a 3-tier database architecture with physical tenant isolation. You'll encounter hard problems regularly, and you'll have the room to solve them properly.
Critically, this is a high-traffic, low-latency, high-availability platform. We're building toward handling millions of requests per day across multiple regions, and non-functional requirements such as scalability, data storage optimisation, latency, availability, and cost optimisation are first-class concerns, not afterthoughts. Every feature you build needs to be designed with these considerations from the start.
We serve global financial institutions and are subject to some of the strictest regulatory requirements around data residency, data transfer, and information security. Security isn't bolted on; it's a foundational layer. Data isolation, encryption in transit and at rest, GDPR compliance, and jurisdictional data storage and processing are built into the architecture from the ground up, and you'll be expected to maintain and extend that standard in everything you build.
What You'll Do
- Build and extend the platform engine — new workflow node types, control flow improvements, human-in-the-loop infrastructure, provider capabilities, and tooling that product teams consume
- Design for scale and performance — optimise data storage patterns, manage connection pooling across a large number of tenant databases, reduce latency in high-concurrency, multi-tier data access and in workflow / agent orchestration, and ensure the platform handles high throughput without degradation
- Build event-driven integrations — services that subscribe to workflow and agent completion events and trigger downstream actions, extending the platform's reactive capabilities
- Integrate AI/LLM capabilities — multimodal models, tool calling / MCP, semantic routing, provider management, agent orchestration including A2A, token and cost tracking, and streaming patterns
- Uphold security and compliance standards — tenant data isolation, encryption, access control patterns, credential management, and audit logging in a regulated financial services context
- Work across the data layer — schema evolution across multiple database tiers, migration tooling, tenant provisioning, query optimisation, and caching strategies
- Contribute to infrastructure — Terraform modules, Docker builds, CI/CD pipelines, deployment automation, monitoring and observability
- Maintain platform quality — comprehensive testing, strict TypeScript, shared library architecture, structured error handling, performance benchmarking
Who You Are
- End-to-end owner — you take problems from understanding through to deployment and monitoring. You don't wait to be told what to do next
- Platform thinker — you understand the difference between building a feature and building a capability. You design with product teams and customers in mind
- Performance-minded — you instinctively consider how things behave under load. It's part of how you think, not a box you tick before release
- Security-conscious — you treat security as a responsibility, not a checklist. You care about getting it right because the data matters
- Strong communicator — remote work demands clear writing. You're effective in PRs, documentation, and async conversations
- Pragmatic about quality — you care about code quality as a tool for moving faster, not as an abstract ideal. You write tests because they help, not because someone told you to
- Collaborative and open — you bring strong ideas to the table and you're happy to debate them, but you're equally comfortable being convinced otherwise
Skills & Experience
Must Have
- Strong TypeScript / Node.js — you write strict, type-safe code and are comfortable with generics, decorators, and advanced patterns
- Backend framework experience — Express, NestJS, Fastify or similar structured MVC or modular frameworks with dependency injection, middleware, and guard patterns
- Relational database proficiency — schema design, migrations, indexing, query optimisation, transactions, and connection management at scale
- Containerisation — building, running, and debugging containerised services in local and production environments
- API design — RESTful services with proper authentication, authorisation, pagination, error handling, and rate limiting
- Performance and scalability — you've built systems that handle significant traffic. Caching strategies, connection pooling, query optimisation, and low-latency design are second nature
- Security fundamentals — encryption, secrets management, data access boundaries, audit logging. Experience in regulated industries is a strong plus
- Distributed systems experience — services communicating across network boundaries, consistency trade-offs, multi-region deployment challenges
- Durable workflow / orchestration engines — experience with systems that guarantee execution completion across failures and long-running processes (e.g. DBOS, Restate). Experience with Temporal is a strong plus
- Infrastructure-as-code — managing infrastructure declaratively with version-controlled tooling
- Testing — unit and integration testing, mocking strategies, high coverage targets
- CI/CD and Git — automated pipelines, branching strategies, PR-based workflows
- Event-driven patterns — pub/sub, message queues, event subscriptions, async processing
- Cloud infrastructure — deploying and managing services, databases, caching layers, and storage on a major cloud provider
- Multi-tenant architecture — database-per-tenant patterns, tenant isolation, data residency compliance
- AI/LLM integration — working with LLM provider APIs, function/tool calling, agent patterns, prompt engineering
- ORM tooling — type-safe database access with migration management
- Monorepo management — build orchestration, affected detection, shared library architecture
Nice to Have
- NoSQL databases — single-table design patterns, global replication
- RegTech, FinTech, or compliance domain experience
- Enterprise authentication providers — SSO, OIDC, SAML, directory sync, JWKS validation
- Real-time streaming — server-push patterns for live data delivery
- Workflow graph interpretation — topological sorting, branching logic, loop execution
- DSL-based workflow orchestration - working with YAML/JSON-based domain-specific languages that define workflows declaratively
What We Offer
- Shape Our Future, Together: You'll be one of the earliest engineering hires. You'll have a direct hand in shaping not just the product, but how we build software and what kind of engineering team we become.
- Build the Engine, Not the Features: Your work empowers product teams and customers to move fast. You're building capabilities and tooling, not grinding through feature tickets. It's platform engineering with real leverage.
- Technical Depth: Multi-region infrastructure, durable workflow execution, AI agent orchestration, and a data architecture with real complexity. The problems here are interesting and the codebase is well-structured.
- Flexible Hours: We have core hours for alignment, but we trust you to manage your time. Get the work done on the schedule that works for you.
- Unlimited PTO: Take what you need to stay sharp and healthy. We trust you.
- Direct Impact: In a small team, your work ships and matters immediately. You'll see the results of your contributions every day.
- A Culture of Trust and Collaboration: A Kanban-style workflow with minimal ceremony, transitioning to full Agile / Scrum as we scale the team. Open and frequent communication, mutual respect, and a shared commitment to building something that matters.
How to Apply
Click apply and attach your CV. If you have a GitHub profile, side project, or writing that shows how you think about software, include it in your CV. We'd love to see it, but it's not required.
Our process is a conversation about your experience, a practical technical discussion about real problems, and a short take-home exercise that reflects the kind of work you'd actually do here.