Lead Platform Engineer - Kubernetes
Omnis Partners are hiring on behalf of an enterprise-focused AI consultancy helping large organisations move AI systems from prototype into safe, scalable production.
They are looking for a Lead Platform Engineer with deep, hands-on experience designing and operating cloud-native platforms at scale.
The role centres on production reliability, Kubernetes, platform automation and building self-service infrastructure for engineering teams.
The role
You will help design, build and operate secure, reliable platform infrastructure for enterprise AI workloads.
The work will involve:
- Building and operating Kubernetes-based platforms
- Designing self-service infrastructure for internal engineering teams
- Improving reliability, resilience, observability and deployment safety
- Keeping platform tooling secure, current and production-ready
- Working closely with engineering teams and client stakeholders
- Making pragmatic trade-offs around reliability, cost, security and delivery speed
Technical experience
- 12+ years’ professional engineering experience
- Strong hands-on Kubernetes experience
- Terraform as the primary IaC tool
- Helm, Kustomize, Pulumi or similar
- ArgoCD / FluxCD
- Cert-Manager, ExternalDNS, Cluster Autoscaler
- Bash / shell scripting
- Python or Go for tooling and automation
- Secrets management and supply-chain security
- Cosign, SBOMs, vulnerability scanning or similar
- DNS, TLS, load balancers and service mesh concepts
- Strong production reliability and operational ownership
Useful background
- Platform engineering, SRE, DevOps or infrastructure engineering
- Experience building platforms for internal developers
- Regulated, enterprise, financial services, public sector or security-conscious environments
- Exposure to private cloud, sovereign cloud, on-prem or non-standard cloud environments
The person
You should be a senior hands-on engineer who thinks in terms of platforms, not just infrastructure tasks.
You’ll suit this role if you are pragmatic, security-aware, collaborative with product and engineering teams, and comfortable keeping infrastructure reliable while enabling teams to move quickly.
AI/ML experience is not required, but interest in the platform and reliability challenges behind production AI systems would be helpful.