and activity. Knowledge of distributed computing and cloud-native applications, including proficiency in AWS, Terraform, ELK stack (including monitoring tools as mentioned), PagerDuty/OpsGenie or similar, and Jenkins. NON-TECHNICAL REQUIREMENTS: Awareness of Site Reliability Engineering (SRE) principles, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and more »
and high service availability, able to define, implement and improve business performance SLOs. Production operations including 24x7 on-call support, escalation/paging with OpsGenie, incident management, and RCA (Root Cause Analysis). Maintain existing compliance and governance standards established in the business. Key Experience: Deep understanding of SRE ethos and more »
high service availability, able to define, implement and improve business performance SLOs. Production operations including 24x7 on-call support, escalation/paging with OpsGenie, incident management, and RCA (Root Cause Analysis). Maintain existing compliance and governance standards established in the business. Key Experience: Deep understanding of Google Cloud (GCP more »