Gen AI Engineer

We're hiring a Gen AI Engineer

If you've built LLM systems that ran in production, failed in production, and you were the one who fixed them, this is worth reading.

What our client does

They're an AI company operating at the intersection of computer vision and large language models, building intelligent workflows for industries where work happens in the field: utilities, telecoms, energy, retail.

Their platform processes real-world operational data at scale, helping global enterprise clients make faster, safer and more accurate decisions about their assets and people.

They have built one of the largest datasets of real-world operational workflows in this space and have over 50 AI models running in production today.

The role

This is not a prompt engineering or prototype role.

You will own LLM systems end to end in production, including:

Building and deploying LLM applications and agent workflows
Monitoring system behaviour, performance and output quality
Debugging issues across pipelines such as retrieval, orchestration and model outputs
Tracing failures using logs, metrics and LLM call inspection
Designing evaluation frameworks to detect regressions and drift

You will be responsible not just for building systems, but for keeping them reliable under real-world conditions.

What we're looking for

Experience building and running LLM applications in production environments
Evidence of debugging real issues such as incorrect outputs, latency spikes, retrieval failures or agent misbehaviour
Experience with monitoring and observability of LLM systems, for example Langfuse, Prometheus, Grafana, OpenTelemetry or similar
Strong understanding of RAG systems, retrieval pipelines and evaluation workflows
Experience with agentic frameworks such as LangGraph, CrewAI or similar, going beyond basic LangChain usage
Ability to explain how you diagnose and fix issues step by step
Strong Python and experience working across application and infrastructure layers
Multimodal experience across text and image or video is beneficial

Tech stack

Python, AWS, LangGraph, LangChain, vector databases, evaluation tooling, observability platforms, Docker

Why join

Small, senior team with high ownership
Systems already in production with real customers
Bi-weekly shipping cycles with fast feedback loops
Remote-first with optional London office and monthly meetups
Equity, healthcare allowance, pension

You will be working on systems where failures matter and fixing them is part of the job.
