Senior Solutions Architect
- Hiring Organisation: RemoteStar
- Location: City of London, London, United Kingdom

- …/CD, monitoring and orchestration frameworks (e.g. Kubeflow, Flyte, MLflow).
- Good knowledge of Docker and Kubernetes for containerizing AI workloads.
- Understanding of LLM inference stacks (vLLM, llama.cpp, OpenVINO) and model deployment formats (ONNX, .safetensors, Hugging Face Model Hub).
- Experience in sizing GPU infrastructure for LLM inference … training (memory, throughput, hardware categories ranging from A10 to H200).
- Experience in evaluating and benchmarking LLM performance (accuracy, latency, throughput).
- Practical programming skills in Python and SQL, as well as knowledge of ML libraries and frameworks such as PyTorch, TensorFlow and Hugging Face.
- Bachelor's or Master ...