Site Reliability Engineer (Ilkley)
Ilkley, West Yorkshire, UK
SmartSearch
will be responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure and applications. This role focuses on maintaining and improving system observability, automating operations, and enhancing deployment practices to support business-critical services. Reporting directly to the Lead Site Reliability Engineer, you will be expected to work … Ilkley office for occasional office attendance. VARIED DAY TO DAY RESPONSIBILITIES Ensuring system reliability, performance, and scalability through monitoring and automation Building and maintaining observability solutions using Grafana, Prometheus, Loki, OpenTelemetry Proactively identifying and resolving performance bottlenecks and infrastructure issues Automating infrastructure provisioning, configuration management, and deployments Implementing effective logging … FOR IN A CANDIDATE? Experience with SRE principles, such as incident management, error budgets, and service-level objectives (SLOs) Experience designing and implementing robust observability, monitoring and logging solutions Strong proficiency with observability and monitoring tools such as Grafana, Prometheus, and Loki Strong experience with distributed tracing and telemetry tools More ❯
Posted: