Senior Software Engineer, Site Reliability Engineering, Cloud IRT
Minimum qualifications:
- Bachelor’s degree in Computer Science, a related field, or equivalent practical experience.
- 5 years of experience with software development in one or more programming languages.
- 3 years of experience in designing, analyzing, and troubleshooting distributed systems.
- 2 years of experience leading projects and providing technical leadership.
Preferred qualifications:
- Experience in telemetry systems and incident and risk management.
- Ability to work across organizational boundaries.
- Ability to balance product/development velocity with architectural hygiene.
- Excellent problem-solving and communication skills, with a passion for learning from experience.
Responsibilities:
- Define and escalate risks in Cloud and reduce incident probabilities with strategic and tactical/pragmatic approaches as appropriate.
- Focus on high-quality customer outcomes and continuous collaboration across GCP teams.
- Create IMAG training and end-to-end processes for the incident management lifecycle, and partner with Cloud SRE UTLs and the Cloud Support leadership team.
- Build systems and tooling to support the Cloud IRT team. Improve Cloud visibility, detection of large-scale issues, and communications to customers, stakeholders, and customer-facing teams.
- Participate in an on-call rotation supporting critical incident response for all of GCP.