Site Reliability Engineer
Cambridge, Cambridgeshire, United Kingdom
Hybrid / WFH Options
AI Tech Suite
the edge. Proficiency in Python, Docker, Linux systems, and scripting (Bash, Python). Strong expertise with infrastructure automation tools (Terraform, Ansible). Experience managing observability and monitoring systems, particularly Prometheus. Deep understanding of networking concepts and protocols. Responsibilities: Design, build, and maintain scalable and resilient infrastructure on the edge. Develop … as-code solutions using Terraform, Ansible, and scripting languages (Python, Bash). Deploy and manage containerized applications using Docker and related technologies. Ensure system observability by building and optimizing monitoring systems, particularly using Prometheus. Troubleshoot and optimize Linux-based systems (e.g., Red Hat, CentOS, Ubuntu). xAI's Grok is … technologies such as Prometheus, Grafana, and PagerDuty. Expert knowledge of deployment technologies such as Pulumi or Terraform. Expert knowledge of Kubernetes. Responsibilities: Improving our observability by adding/adjusting metrics. Building easily parsable dashboards. Designing and overseeing our on-call rotations. Improving our deployment process to increase reliability. Luminance is …
Employment Type: Permanent
Salary: GBP Annual
Posted: