Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience by enabling actionable monitoring and alerting. Drive cloud cost visibility and optimization efforts across engineering through dashboards, tagging standards, and automation. Partner with stakeholders to … platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire, and develop talented platform engineers with …
Digital Transformation, we are investing in IT services to enhance learning and research capabilities, fostering global collaboration on pressing issues. Your role involves supporting existing services and applications, managing incident responses, and delivering sustainable services that facilitate discovery, usability, management, and preservation of Special Collections metadata and digital collections. You will also support the Library's Digital Library repository …
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
and availability of our security infrastructure. What You'll Be Doing * Managing Hardware Security Modules (HSMs) and cryptographic infrastructure * Creating, storing, and retiring encryption keys securely across multiple platforms * Supporting incident and change management processes * Collaborating with application, infrastructure, and support teams * Ensuring compliance with security standards and audit requirements * Contributing to project delivery and continuous improvement initiatives What We're Looking … work under pressure * Excellent communication and stakeholder management skills Nice to Have * ITIL Foundation certification * Security or project management certifications * Experience with tools like JIRA, Confluence, SharePoint * Background in incident response and risk management Benefits * Salary up to £41,000 depending on experience * Pension of 12% * Private medical * Discretionary bonus Please Note: This is a permanent role for UK residents …
Macclesfield, Cheshire, England, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
and availability of our security infrastructure. What You'll Be Doing * Managing Hardware Security Modules (HSMs) and cryptographic infrastructure * Creating, storing, and retiring encryption keys securely across multiple platforms * Supporting incident and change management processes * Collaborating with application, infrastructure, and support teams * Ensuring compliance with security standards and audit requirements * Contributing to project delivery and continuous improvement initiatives What We're Looking … work under pressure * Excellent communication and stakeholder management skills Nice to Have * ITIL Foundation certification * Security or project management certifications * Experience with tools like JIRA, Confluence, SharePoint * Background in incident response and risk management Benefits * Salary up to £41,000 depending on experience * Pension of 12% * Private medical * Discretionary bonus Please Note: This is a permanent role for UK residents …
Warrington, Cheshire, England, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
and availability of our security infrastructure. What You'll Be Doing Managing Hardware Security Modules (HSMs) and cryptographic infrastructure Creating, storing, and retiring encryption keys securely across multiple platforms Supporting incident and change management processes Collaborating with application, infrastructure, and support teams Ensuring compliance with security standards and audit requirements Contributing to project delivery and continuous improvement initiatives What We're Looking … work under pressure Excellent communication and stakeholder management skills Nice to Have ITIL Foundation certification Security or project management certifications Experience with tools like JIRA, Confluence, SharePoint Background in incident response and risk management Benefits Salary from £35-45,000 depending on experience Pension of 12% Private medical Discretionary bonus Please Note: This is a permanent role for UK residents …
Google Cloud products, particularly in the context of analytics platforms or large-scale infrastructure. Strong understanding of Site Reliability Engineering (SRE) principles, including SLIs/SLOs, error budgets, and incident response. Experience with infrastructure as code (e.g., Terraform, Deployment Manager) and CI/CD pipelines. Proficiency in monitoring, logging, and observability tools (e.g., Stackdriver, Prometheus, Grafana). Proficient in … serverless architectures. Exposure to data analytics platforms or big data tools (e.g., BigQuery, Dataflow, Pub/Sub). Programming or scripting skills in Python, Go, or Bash. Experience with incident management, postmortems, and continuous improvement practices. About working for us Our ambition is to be the leading UK business for diversity, equity and inclusion supporting our customers, colleagues and …