Remote Site Reliability Engineer (SRE)
- Hiring Organisation: Air Apps
- Location: Milton Keynes, Buckinghamshire, UK
… system performance, scalability, and incident response workflows to improve uptime. Work closely with development and DevOps teams to improve system design for reliability. Conduct root cause analysis (RCA) and implement preventative measures to minimize failures. Ensure high availability by designing and maintaining load balancing, failover, and disaster … Hands-on experience with containerization and orchestration (Docker, Kubernetes, Helm). Strong Linux system administration and networking fundamentals. Experience with incident management, debugging, and root cause analysis. Proficiency in scripting (Bash, Python, or Go) for automation and system monitoring. Knowledge of load balancing, failover strategies ...