Gen AI Engineer

We're hiring a Gen AI Engineer

If you've built LLM systems that ran in production, watched them fail in production, and been the one to fix them, this is worth reading.

What our client does

They're an AI company operating at the intersection of computer vision and large language models, building intelligent workflows for industries where work happens in the field: utilities, telecoms, energy, retail.

Their platform processes real-world operational data at scale, helping global enterprise clients make faster, safer and more accurate decisions about their assets and people.

They have built one of the largest datasets of real-world operational workflows in this space and have over 50 AI models running in production today.

The role

This is not a prompt engineering or prototype role.

You will own LLM systems end to end in production, including:

  • Building and deploying LLM applications and agent workflows
  • Monitoring system behaviour, performance and output quality
  • Debugging issues across pipelines such as retrieval, orchestration and model outputs
  • Tracing failures using logs, metrics and LLM call inspection
  • Designing evaluation frameworks to detect regressions and drift

You will be responsible not just for building systems, but for keeping them reliable under real-world conditions.

What we're looking for

  • Experience building and running LLM applications in production environments
  • Evidence of debugging real issues such as incorrect outputs, latency spikes, retrieval failures or agent misbehaviour
  • Experience with monitoring and observability of LLM systems, for example Langfuse, Prometheus, Grafana, OpenTelemetry or similar
  • Strong understanding of RAG systems, retrieval pipelines and evaluation workflows
  • Experience with agentic frameworks such as LangGraph, CrewAI or similar, going beyond basic LangChain usage
  • Ability to explain how you diagnose and fix issues step by step
  • Strong Python and experience working across application and infrastructure layers
  • Multimodal experience across text and image or video is beneficial

Tech stack

Python, AWS, LangGraph, LangChain, vector databases, evaluation tooling, observability platforms, Docker

Why join

  • Small, senior team with high ownership
  • Systems already in production with real customers
  • Bi-weekly shipping cycles with fast feedback loops
  • Remote-first with optional London office and monthly meetups
  • Equity, healthcare allowance, pension

You will be working on systems where failures matter and fixing them is part of the job.

Job Details

Company: Wave Group
Location: England, United Kingdom (Hybrid / Remote Options)