Lead Platform Engineer

Overview

In this position, you will take ownership of the infrastructure systems that power cutting-edge machine learning evaluations. You'll have significant influence over the architecture, tools, and processes that support a growing research team focused on large-scale AI systems.

What You’ll Do

  • Design and maintain scalable, reliable infrastructure to support large language model evaluations using Infrastructure as Code (IaC) principles
  • Evaluate, select, and integrate technologies that support deployment and scaling (e.g., determining whether to adopt container orchestration tools)
  • Develop internal tools that interact with infrastructure for scheduling jobs, managing access, storing evaluation outputs, and more
  • Partner closely with research staff to forecast and prepare for evolving system requirements, including high-performance compute environments
  • Ensure the smooth execution of evaluations across the entire stack — from cloud services and orchestration layers to application code
  • Manage cloud accounts, including provisioning resources, setting permissions, and optimizing for security and cost
  • Contribute to the design and enforcement of security protocols across the organization
  • Participate in growing and shaping a specialized infrastructure function as the team's needs expand

Who We’re Looking For

  • 8+ years of programming in Python
  • A software engineering background
  • Proven ability to lead infrastructure planning through to final deployment
  • 8+ years of designing and architecting Kubernetes in production
  • 5+ years of AWS experience
  • Experience implementing security practices
  • Experience working with Terraform
  • Willingness to travel to Central London twice a week

To be contacted, please apply with a detailed CV.

Company
Signify Technology
Location
London, UK
Posted