Incident Management Jobs in Ilkley

1 of 1 Incident Management Jobs in Ilkley

Site Reliability Engineer

Ilkley, West Yorkshire, UK
SmartSearch
… automation
Building and maintaining observability solutions using Grafana, Prometheus, Loki, OpenTelemetry
Proactively identifying and resolving performance bottlenecks and infrastructure issues
Automating infrastructure provisioning, configuration management, and deployments
Implementing effective logging, monitoring, and alerting strategies
Managing incident response and post-mortem processes to improve system resilience
Implementing high-availability … cloud architectural decisions
Continuously improving infrastructure reliability and operational efficiency

WHAT ARE WE LOOKING FOR IN A CANDIDATE?
Experience with SRE principles, such as incident management, error budgets, and service-level objectives (SLOs)
Experience designing and implementing robust observability, monitoring, and logging solutions
Strong proficiency with observability and … cloud-native applications in production environments
Proficiency in capacity planning and performance optimization
Experience in managing and improving CI/CD pipelines
Knowledge of incident response best practices and on-call operations

WHAT IS LIFE LIKE AT SMARTSEARCH?
We are a multi-award-winning tech company with an aspirational …