Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
Required Skills and Experience: Proven experience in cloud infrastructure automation. Deep knowledge of cloud platforms (AWS, Azure, GCP), containerization (Docker, Kubernetes, Rancher), automation tools (Terraform, Ansible), and monitoring solutions (Prometheus, Grafana). Strong scripting and programming skills in Bash, Python, and Go. Experience in DevOps, SRE, or Platform Engineering roles with a focus on hybrid infrastructure. Familiarity with Agile methodologies
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
teams to: Modernise our infrastructure by leading the migration from Docker Swarm to Kubernetes; Design and operate CI/CD pipelines using CloudBees and GitLab; Build out observability with Prometheus, Grafana, OpenTelemetry, and Dynatrace; Automate cloud deployments (AWS-first) using Terraform and platform tooling; Improve security posture across IAM, secrets, and networking; Help the team ship faster and safer by
setting up and managing monitoring, metrics, and alerting systems. Experience operating production-grade services at scale. Great to have: Experience with tools such as Terraform, SaltStack, MongoDB, Elasticsearch, Kafka, Prometheus, Grafana, or HashiCorp Vault. Experience with securing applications, services, and data, including authentication, authorization, TLS, and encryption. Exposure to Kubernetes (administering, deploying, or developing apps on K8s clusters). Understanding of
Cambridge, Cambridgeshire, East Anglia, United Kingdom
Xcede
alerts, and service flow mappings aligned to engineering needs. Help teams craft complex DQL queries to extract meaningful insights from telemetry data. Support observability design and migration efforts from Prometheus, Grafana, and CloudWatch to Dynatrace. Advise on RBAC models and data access strategies based on team structure and security requirements. Assist in monitoring strategy for Kubernetes-based workloads, especially in