Senior Azure SaaS Reliability & Support Engineer
Kingston Upon Thames, England, United Kingdom
Hybrid/Remote Options
Hybrid/Remote Options
Reveal Media
Develop self-healing routines for recurring problems. 4. Monitoring & Reporting Implement and maintain Azure Monitor/Application Insights/Log Analytics dashboards for: Environment uptime & performance SLA compliance & error budget tracking Incident trends and recurring issue analysis Provide regular reliability reports and improvement recommendations to stakeholders. 5. Continuous Improvement & Knowledge Sharing Feed recurring issues and systemic risks into the … to post-incident reviews with actionable follow-ups. Maintain troubleshooting guides and technical runbooks for common issues. Success Measures (KPIs) Uptime: ≥ target SLO % for ST and MT environments. Error Budget Burn Rate: Maintain within agreed thresholds. Incident Metrics: Reduce MTTR for P1/P2 incidents. Reduce recurrence rate of common issues. Automation Impact: Number of recurring issues automated/… and automation skills in PowerShell and/or C#. Understanding of Azure components: App Services, VMs, SQL DB, Blob Storage, scaling strategies. Experience in capacity planning, SLOs, and error budget management Azure Monitor, Application Insights, Log Analytics, Azure Data Explorer (KQL), Azure Functions, Logic Apps, PowerShell, C#, SQL Server Management Studio, Azure Storage Explorer, Power BI (for More ❯
Posted: