Platform Engineer
- Hiring Organisation: SoCode Recruitment
- Location: Cambridge, Cambridgeshire, UK
canary releases with automated rollback and recovery mechanisms
• Lead initiatives to improve platform reliability by removing single points of failure and enhancing autoscaling, high availability and managed service usage
• Collaborate with SRE, Security and Engineering teams to strengthen observability, monitoring and alerting using Prometheus, Grafana and CloudWatch
• Work … networking and cloud security principles
• Familiarity with observability tools such as Prometheus, Grafana and Loki along with structured alerting practices
• Experience with database migrations, high availability configurations, backups and disaster recovery
• Strong scripting and automation skills using Terraform, Python, Bash or similar languages
• Excellent communication and collaboration skills ...