Infrastructure Engineering, AVP-2
- Hiring Organisation: State Street
- Location: Greater London, United Kingdom
- Employment Type: Full Time
cluster provisioning, scaling, and recovery

Observability, Monitoring & Reliability Engineering
- Design and maintain platform observability frameworks using:
  - Prometheus, Grafana, Dynatrace, Elasticsearch
  - Azure Monitor, Log Analytics
  - OpenTelemetry (where applicable)
- Ensure proactive monitoring of cluster health, application performance, and infrastructure metrics
- Drive incident management practices, root cause analysis (RCA), and continuous reliability improvements

… cloud-native architectures.
- Hands-on experience with DevOps platform tooling (e.g. ArgoCD, Terraform, Azure DevOps, scripting)
- Operational experience with observability tools: Dynatrace, Prometheus, Grafana, OpenTelemetry
- Experience influencing or owning platform/product roadmaps in partnership with Product Management
- Solid background in cloud-native engineering concepts, performance optimization, security, and governance

...