Google Product Site Reliability Engineer (Hybrid Working)
We're looking for a Google Product Site Reliability Engineer to join our Public Cloud Platform. You'll have a unique opportunity to join an ambitious team strengthening observability, reliability and operational excellence across our GCP platform, driving our tech modernisation agenda and enabling us to become the biggest Fintech in the UK.
The ideal candidate will have demonstrable experience in Cloud engineering and Observability platforms, and a passion for technology. Commitment to delivering high-quality, scalable solutions is a must. You'll bring your expertise to partner closely with the product engineering teams to ensure systems are observable, reliable and operable at scale.
What you'll do
Define and evolve observability standards across metrics, logs, traces and events
Partner with teams to ensure services are observable by design
Use Dynatrace as the primary observability tool to ensure effective instrumentation and coverage, meaningful dashboards, and SLO-based alerting aligned to user impact
Be a hands-on engineer, maintaining our Infrastructure as Code and CI/CD pipeline-based products and services by responding to change, implementing enhancements, and improving reliability and customer experience
Observe, investigate and fix service issues with an engineering mindset, resolving them via code changes and implementing improvements to prevent repeat issues
Implement further automation and reduce toil by utilising existing Cloud tooling or introducing new technologies
Core Technical Skills
Google Cloud Platform (GCP) hands-on experience; certifications preferred
Site Reliability Engineering (SRE)
Dynatrace instrumentation, dashboards, SLO-based alerting
Terraform Infrastructure as Code (modular, maintainable)
CI/CD: Jenkins, Azure DevOps, GitHub
Kubernetes production cluster administration
DevOps automation, toil reduction
Scripting: Python, Groovy, Bash, PowerShell
Cloud Security, Networking & APIs
Incident Management & Troubleshooting