and written communication skills and are willing to present and defend your ideas to technical and non-technical audiences. Additional Desired Skills: Experience with incident management platforms like PagerDuty, OpsGenie, or similar tools. Understanding of SLO/SLA management and implementations. Knowledge of industry standard incident management frameworks and best practices. Familiarity with automated remediation and runbook automation. Experience ...
protocols, encoding/transcoding workflows. Demonstrated ability to lead technical recovery during high-pressure incidents. Familiarity with observability tools (e.g., Grafana, Prometheus, Datadog) and incident management platforms (e.g., PagerDuty, Opsgenie). Excellent communication and stakeholder management skills. Strong analytical and problem-solving abilities. What's in it For You? Hybrid Work Model: We've adopted a flexible hybrid working ...
dashboards and reports. Maintain existing alarms and create new ones to monitor application health, mainly on AWS. Integrate with third-party systems to improve monitoring and reporting. Manage the Opsgenie rotation schedule(s) and participate in rotations for mission-critical systems. Collaborate with IT and security network specialists for cohesive monitoring. Work with development engineers to implement application service ...
Training and Training on AWS Well-Architected Reviews will be provided. CORE COMPETENCIES: Experience with the core AWS services and cloud technology. Experience with monitoring solutions such as CloudWatch, OpsGenie or similar. Experience in maintaining cloud-native applications in an AWS environment. Demonstrable ability to rapidly learn new technologies to solve problems in new areas. DESIRABLE COMPETENCIES: AWS Solutions ...