… RDS. Good understanding of monitoring and logging solutions, e.g. Prometheus, AWS CloudWatch, Grafana, OpenTelemetry, Honeycomb, ELK, etc. Basic SRE knowledge and experience with alerting and incident management platforms (e.g. Opsgenie, PagerDuty). Proven ability to provide and support strong and scalable CI/CD pipelines. Linux, Git, Docker, and good scripting skills in e.g. Python, Bash, Go. You should be …
… and written communication skills and are willing to present and defend your ideas to technical and non-technical audiences. Additional Desired Skills: Experience with incident management platforms like PagerDuty, Opsgenie, or similar tools. Understanding of SLO/SLA management and implementations. Knowledge of industry-standard incident management frameworks and best practices. Familiarity with automated remediation and runbook automation. Experience …
… protocols, encoding/transcoding workflows. Demonstrated ability to lead technical recovery during high-pressure incidents. Familiarity with observability tools (e.g., Grafana, Prometheus, Datadog) and incident management platforms (e.g., PagerDuty, Opsgenie). Excellent communication and stakeholder management skills. Strong analytical and problem-solving abilities. What's in It for You? Hybrid Work Model: We've adopted a flexible hybrid working …