SRE - Site Reliability Engineer

Senior Site Reliability Engineer (Observability)

Location: London/UK (Remote)

Contract: 12 Months Initial

Rate: £55 - £62 per hour (Inside IR35)

Job Overview

We are looking for a Senior Site Reliability Engineer with strong experience in Observability, Monitoring and Distributed Systems to support large-scale cloud infrastructure serving millions of devices globally. The role focuses on building and scaling monitoring, logging and alerting platforms to ensure high availability and performance of cloud services.

Responsibilities

  • Design, deploy and scale observability platforms
  • Manage and scale Prometheus monitoring systems
  • Deploy and maintain large Elasticsearch clusters
  • Build and maintain data pipelines using Kafka
  • Develop alerting and monitoring frameworks
  • Automate infrastructure using Terraform and Ansible
  • Develop tools and scripts using Python, Go, Ruby or Bash
  • Work with Linux systems (Debian/Ubuntu)
  • Participate in on-call rotation
  • Improve system reliability, performance and scalability

Required Skills

  • 5+ years' experience in Site Reliability Engineering/DevOps
  • Strong Linux systems experience
  • Observability and Monitoring tools experience
  • Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana)
  • Kafka
  • Terraform/Infrastructure as Code
  • Ansible/Configuration Management
  • Programming experience (Python, Go, Ruby or Bash)
  • Distributed systems and cloud infrastructure experience

This is an urgent vacancy, and the hiring manager is shortlisting for interviews immediately. Please apply with a copy of your CV.

Randstad Technologies is acting as an Employment Business in relation to this vacancy.

Job Details

Company: Randstad Technologies
Location: London, United Kingdom
Employment Type: Contract
Salary: GBP 55 - 62 Hourly
Posted