LLM Architect
Edinburgh (on-site)
£100k-120k + exceptional benefits
A rare chance to drive the future of AI infrastructure at one of the world's leading R&D tech organisations.
This is a senior opportunity with a global research leader, where you'll architect and optimise the platforms that deliver large-scale language models to production. You'll be working on some of the hardest challenges in distributed AI systems: building ultra-reliable, ultra-scalable environments for inference and deployment.
What you'll be doing
- Designing cloud-native architectures to run large language models on serverless frameworks (e.g. Kubernetes, Knative, or custom-built FaaS).
- Developing approaches to minimise cold-start latency through advanced container snapshotting, weight pre-loading, and graph partitioning.
- Building distributed inference pipelines with tensor parallelism, model sharding, and efficient memory scheduling to serve LLMs at scale.
- Experimenting with quantisation, pruning, and KV-cache management to squeeze maximum throughput from GPU/accelerator clusters.
- Working closely with applied researchers to turn state-of-the-art methods into robust, production-grade systems.
What you'll bring
- Deep understanding of large-scale ML systems engineering, with direct experience deploying or optimising LLMs.
- Hands-on expertise in C++/Rust/Go for systems programming, plus Python for model integration.
- Strong knowledge of distributed runtimes and scheduling frameworks (e.g. Ray, Dask, MPI, or custom equivalents).
- Experience with GPU cluster management (CUDA, NCCL, Triton Inference Server) and performance tuning across accelerators.
- Solid grasp of cloud-native orchestration (Docker, Kubernetes, Helm) and observability tooling (Prometheus, Grafana, Jaeger).
- Proven ability to translate cutting-edge research into engineered solutions that can scale globally.
Why join
- Influence how next-generation LLM services are built and delivered to millions of users worldwide.
- Operate at the intersection of distributed systems, high-performance computing, and AI research.
- Join a global R&D organisation with unmatched resources, where innovation isn't just encouraged, it's expected.
Bright Purple is proud to be an equal opportunities employer. We partner with clients who value and actively promote diversity and inclusion across the technology sector.
- Company
- Bright Purple
- Location
- Edinburgh, Midlothian, United Kingdom EH120
- Employment Type
- Permanent
- Salary
- GBP Annual
- Posted