large directional technical decisions (e.g., deciding which technology or pattern to create or leverage). Experience being "on-call" for a service, and familiarity with incident notification tooling (e.g., PagerDuty, Opsgenie). Comprehensive understanding of SRE principles (e.g., working knowledge of the Google SRE book). Demonstrated strength in leading a project in an agile/scrum environment. Thrives in a diverse
including out-of-hours support. Strong understanding of ITIL v3 or v4 frameworks and service management best practices. Experience working with monitoring, ticketing, and ITSM tools (e.g., ServiceNow, Jira, Opsgenie). Excellent communication, stakeholder management, and analytical skills. Ability to lead cross-functional teams during incidents and post-mortem activities. The successful candidate will also participate in an on
and written communication skills and are willing to present and defend your ideas to technical and non-technical audiences. Additional Desired Skills: Experience with incident management platforms like PagerDuty, Opsgenie, or similar tools. Understanding of SLO/SLA management and implementations. Knowledge of industry-standard incident management frameworks and best practices. Familiarity with automated remediation and runbook automation. Experience
protocols, encoding/transcoding workflows. Demonstrated ability to lead technical recovery during high-pressure incidents. Familiarity with observability tools (e.g., Grafana, Prometheus, Datadog) and incident management platforms (e.g., PagerDuty, Opsgenie). Excellent communication and stakeholder management skills. Strong analytical and problem-solving abilities. What's in it For You? Hybrid Work Model: We've adopted a flexible hybrid working
dashboards and reports. Maintain existing alarms and create new ones to monitor application health, mainly on AWS. Integrate with third-party systems to improve monitoring and reporting. Manage the Opsgenie rotation schedule(s) and participate in rotations for mission-critical systems. Collaborate with IT and security network specialists for cohesive monitoring. Work with development engineers to implement application service
Training and Training on AWS Well-Architected Reviews will be provided. CORE COMPETENCIES: Experience with the core AWS services and cloud technology. Experience with monitoring solutions such as CloudWatch, Opsgenie or similar. Experience in maintaining cloud-native applications in an AWS environment. Demonstrable ability to rapidly learn new technologies to solve problems in new areas. DESIRABLE COMPETENCIES: AWS Solutions