Research Engineer (Inference)
- Hiring Organisation: Axiōma Search
- Location: City of London, London, United Kingdom
About

Serving a multimodal agent model in production is a different problem to serving a standard LLM. Context length, tool calls, and computer-use workloads create constraints that require co-designing the inference stack with the model team - not just bolting … after the fact.

This is a VC-backed challenger lab building state-of-the-art computer-use agents. The inference team owns the full stack, from the engine layer (vLLM, SGLang) through to the serving architecture (disaggregated inference, intelligent routing). The team operates at the intersection ...