Senior Applied AI Engineer
Role: Senior Applied AI Engineer
Location: Remote (UK-based)
Type: Full-time, permanent
Compensation: Competitive salary + equity available
About Veridox
Veridox detects fraud in documents and images submitted to insurance companies, airlines, and financial institutions. When someone submits a receipt, ID document, or damage photograph, our platform determines whether it's genuine, tampered with, or AI-generated, and explains why.
Under the hood, we're replacing brittle, hand-wired pipelines with multi-agent AI that reasons more like a human forensic analyst: pulling signals from multiple tools, weighing messy, conflicting clues, and landing on clear, defensible decisions.
The Role
You'll work on the core AI systems that make fraud decisions. This means building and improving the agent pipelines that coordinate LLMs, VLMs, OCR, and forensic tools. You won't just be calling APIs; you'll be designing how autonomous agents reason about complex, ambiguous evidence.
Day to day, you might be:
- Improving how the system handles a document type it's getting wrong
- Designing a new agent reasoning pattern (e.g., cross-referencing findings across multiple documents in a claim)
- Debugging why an agent produces inconsistent verdicts on edge cases
- Deploying pipeline changes to AWS and monitoring real-world performance
- Writing tests that verify an agent makes the right routing decision when evidence conflicts (see the sketch after this list)
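To make that last bullet concrete, here's a minimal sketch of the kind of test we mean. Every name in it (the Evidence type, the route function, the verdict labels) is hypothetical, standing in for our actual pipeline interfaces:

```python
from dataclasses import dataclass

# Hypothetical types and function names for illustration only;
# the real pipeline's interfaces look different.

@dataclass
class Evidence:
    source: str        # e.g. "ocr", "exif_metadata", "vlm"
    verdict: str       # "genuine" or "tampered"
    confidence: float  # 0.0 to 1.0

def route(evidence: list[Evidence]) -> str:
    """Toy router: high-confidence conflicting signals must escalate
    to a human analyst rather than being silently auto-resolved."""
    strong = {e.verdict for e in evidence if e.confidence >= 0.8}
    if len(strong) > 1:
        return "escalate_to_analyst"
    return strong.pop() if strong else "needs_more_evidence"

def test_conflicting_evidence_escalates():
    evidence = [
        Evidence("ocr", "genuine", 0.90),
        Evidence("exif_metadata", "tampered", 0.85),
    ]
    # The invariant under test: a conflict never auto-resolves.
    assert route(evidence) == "escalate_to_analyst"
```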
The work sits at the intersection of AI engineering, systems design, and a surprisingly interesting fraud detection domain.
Builder Mindset
We’re looking for people who build systems, not just talk about them.
This role is about shipping reliable AI into production: debugging edge cases, improving pipelines, and learning from real-world behaviour. If you enjoy turning ideas into working software and iterating quickly, you’ll fit right in.
What We're Looking For
You should be strong in:
- Python: you write clean, typed code and care about it working correctly at the edges
- Working with LLMs beyond chat: structured output, multi-step reasoning, and prompt calibration for reliability
- AWS: comfortable with S3, Lambda, IAM, and deploying services that need to work without you watching them
- LLM/VLM observability (e.g., Arize Phoenix)
- Testing code that involves non-deterministic systems (see the sketch after this list)
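To illustrate what we mean by "structured output" and "testing non-deterministic systems": rather than trusting free-form model text, validate it against a schema and assert on structure, not exact wording. Here's a minimal sketch using Pydantic; the call_llm stub and the schema fields are placeholders, not our production code:

```python
from pydantic import BaseModel, Field, ValidationError

class FraudVerdict(BaseModel):
    # Placeholder schema; real verdicts carry more fields.
    label: str = Field(pattern="^(genuine|tampered|ai_generated)$")
    confidence: float = Field(ge=0.0, le=1.0)
    reasons: list[str] = Field(min_length=1)

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call returning raw JSON text.
    Swap in your provider's SDK here."""
    return '{"label": "tampered", "confidence": 0.74, "reasons": ["font mismatch"]}'

def classify(prompt: str, retries: int = 2) -> FraudVerdict:
    """The model is non-deterministic, so parse defensively and
    retry on schema violations instead of hoping for the best."""
    for _ in range(retries + 1):
        try:
            return FraudVerdict.model_validate_json(call_llm(prompt))
        except ValidationError:
            continue
    raise RuntimeError("model never produced schema-valid output")

def test_classify_returns_schema_valid_verdict():
    verdict = classify("Is this receipt genuine?")
    # Assert on structure, not on the model's exact wording.
    assert 0.0 <= verdict.confidence <= 1.0
    assert verdict.reasons
```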
Ideally you also bring:
- Experience with agent orchestration frameworks (LangGraph, LangChain, or similar)
- Familiarity with vision-language models for image analysis
- Some exposure to document analysis, OCR, or fraud detection
- Opinions about when agents are useful and when they're over-engineering
We don't expect you to know all of:
- The specific AWS services we use (Bedrock AgentCore, EventBridge)
- Our domain (insurance fraud, document forensics)
- The exact frameworks in our stack
These are learnable. What matters more is that you can reason about systems, write reliable code, and think critically about how AI makes decisions.
How We Work
- Small team, high autonomy, async-first
- Security-conscious by default. We assume all inputs are hostile
- We'd rather halt and ask a question than ship something we're uncertain about
- Code gets attacked before it ships. We review from the perspective of a hostile auditor
- We don't over-engineer. Three similar lines of code are better than a premature abstraction
What Growth Looks Like
Early on: You're deploying changes to the pipeline, understanding how evidence flows through agents, and shipping features that touch the reasoning layer.
Later: You own major subsystems such as the document analysis agent, the tool orchestration layer, or the cross-document case analysis pipeline. You're making architectural decisions about how agents reason.
Longer term: You're designing novel multi-agent patterns, integrating new forensic tools, and shaping how the platform learns from analyst feedback over time.