define, implement, and improve business performance SLOs. 2+ years of experience with production operations, including 24x7 on-call support, escalation/paging with OpsGenie, incident management, RCA (root cause analysis), and retrospective analysis. 2+ years in hands-on technical roles (such as site reliability engineer, software
frontend, backend, and APIs. There's also a strong DevOps and observability culture, so you'll get stuck into tooling like Dynatrace, Splunk, and OpsGenie, and help improve reliability and performance from the ground up. This is a role for someone who wants to own the quality space and
across departments. Comfortable collaborating with global teams, including across US and EMEA time zones. Preferred (Bonus) Skills: Hands-on experience with tools like PagerDuty, OpsGenie, ServiceNow, CloudWatch, Chronosphere, or similar. Understanding of SLA/SLO implementation and performance tracking. Exposure to incident management frameworks, automated remediation, and runbook automation.