Site Reliability Engineer
• Salary to £60k + Company Options Scheme
• Hybrid working between your home, their offices (London Vauxhall) & client sites.
NB: Please only apply if you are able to achieve either SC or DV clearance (ideally as a UK National).
Residency in the UK of 5 years (SC) or 10 years (DV) is required, with no more than a 3-month break outside the UK.
Overview
This company is a Workflow & AI Orchestration Specialist. They're on a mission to modernise how public sector organisations manage casework, derive insight from data and deliver citizen services. They’re growing fast and looking for bright, dynamic people to help build their business.
Role
They’re looking for a Site Reliability Engineer (SRE) to join their growing platform and delivery teams. You’ll help design, build, and operate reliable, secure, and performant infrastructure that underpins critical public-sector services.
You’ll combine your background in DevOps, cloud engineering, and automation with a focus on reliability, observability, and scalability. You’ll also work with event-driven technologies, identity and access management, and data platforms, ensuring their orchestration solutions are resilient, secure, and future-ready.
This is an exciting opportunity for an engineer who thrives on solving complex infrastructure challenges, automating everything, and shaping secure-by-design delivery environments.
Responsibilities
Design, build, and operate resilient cloud infrastructure (AWS & Azure preferred)
Develop and maintain CI/CD pipelines using tools such as GitHub Actions, Jenkins, or GitLab CI
Implement infrastructure-as-code using Terraform, Helm, and Kubernetes manifests
Build and manage event-driven architectures using Apache Kafka or similar messaging systems
Operate and maintain databases and data platforms such as PostgreSQL, Elasticsearch, and MongoDB
Configure and support identity and access management (IdAM) solutions such as Keycloak
Monitor system health, performance, and capacity using modern observability stacks (Prometheus, Grafana, ELK, OpenTelemetry)
Champion DevSecOps practices, embedding security and compliance into every stage of delivery
Automate deployment, scaling, and recovery processes to improve reliability and reduce manual toil
Support development teams with environment configuration, secrets management, and container orchestration
Participate in incident management and root-cause analysis to continuously improve system resilience
Contribute to internal platform frameworks, templates, and automation accelerators
Essential
Experience in SRE, DevOps, or Cloud Infrastructure Engineering roles
Strong knowledge of AWS, Azure, or GCP environments
Hands-on experience with Kubernetes, Docker, and container orchestration
Proven experience building and maintaining CI/CD pipelines
Strong experience with infrastructure-as-code (Terraform, Helm, or CloudFormation)
Working knowledge of event-driven systems (Kafka, RabbitMQ, or similar)
Experience managing and tuning databases (PostgreSQL, Elasticsearch, MongoDB)
Familiarity with Keycloak or equivalent identity and access management (IdAM) platforms
Strong scripting or automation experience (Python, Bash, or Go preferred)
Experience with monitoring, logging, and alerting tools for observability
Understanding of networking, DNS, load balancing, and security controls
Collaborative, pragmatic, and proactive problem solver
Must be eligible and prepared to go through SC Security Clearance
Desirable
Experience operating production workloads in the UK public sector or regulated environments
Familiarity with Camunda or other process orchestration platforms
Experience implementing DevSecOps frameworks or security automation tools
Understanding of zero-trust networking and secure-by-design principles
Familiarity with service mesh technologies (e.g. Istio, Linkerd)
Exposure to agentic AI or workflow automation environments
Holds UK security clearance (SC, DV, eDV)
What They Offer
Company option scheme. Employees get a stake in the organisation they’re building.
Meaningful missions. Tackle high-impact public-sector problems that improve services millions rely on.
Hybrid working. Flexible working between their London office, client offices and your home location.
Outputs over optics. They focus on what gets delivered, not hours online.
Working shoulder-to-shoulder with founders. Learn from experienced practitioners in a nurturing, hands-on environment.
Shape their organisation. Your fingerprints will be on their company as it grows.
Autonomy by default. They empower their people to do what’s needed.
Sociable, low-ego, dynamic engagements. No big-firm stiffness or needless bureaucracy.
Professional Development. They invest in their people through training and certifications.
Other Stuff
NB: For non-UK citizens, we cannot accept applications from anyone requiring sponsorship (now or in the future) for UK permanent employment status. If you are using a work visa, it must allow you to work in the UK unrestricted for at least the next 5 years.