Engineering Manager, MLOps, Marketplace, Ecommerce | 35 Million Users | UK Remote OR London, Hybrid, 1 Day PW, Up to £140,000
- Hiring Organisation
- Owen Thomas | Pending B Corp™
- Location
- Manchester, UK
- Employment Type
- Full-time
… incident management. Partner with data science, ML, and product teams to ensure infrastructure supports innovation and business needs. Oversee system reliability, cost optimisation, and vendor relationships to keep infrastructure scalable and efficient. Take ownership of critical ML/infra incidents, ensuring swift resolution and continuous learning. Deliver … Spark, Ray, TensorFlow Distributed, PyTorch Distributed). Strong understanding of monitoring, logging, and observability for large-scale ML systems. Experience in cost optimisation for compute/GPU workloads. Excellent people leadership and communication skills, able to influence technical and non-technical stakeholders. Comfortable working in a fast …