Liverpool, Lancashire, United Kingdom Hybrid / WFH Options
Maxwell Bond
provision Azure resources such as VMs, SQL Databases, Storage, and Application Gateways. Maintain and monitor infrastructure using Azure Monitor, Log Analytics, and Network Watcher. Perform regular patching, updates, and incident response across cloud-based environments. Implement RBAC, Azure AD role management, and enforce security compliance via Azure Policy and Defender for Cloud. Participate in migrations from on-prem
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Pontoon
resolutions are within SLA. Build and nurture strong relationships both internally and externally to enhance service delivery for our customers. Complete and document Root Cause Analyses (RCAs) and Post Incident Reviews (PIRs), recommending improvements where necessary. Contribute to ITSM-driven initiatives, collaborating as a chapter to implement positive changes. Create and maintain Knowledge Base articles for team sustainability and … API testing tools Experience in unit testing with a focus on continual improvement in API monitoring and performance A mindset geared towards optimisation and automation, especially in alerting and incident response processes Strong documentation skills to ensure key processes and learnings are shared across the team Solid understanding of ITIL v4 (certification required) Exposure to Agile methodologies A
both written and spoken Demonstrable experience as a Security Architect or similar role Strong knowledge of security standards, protocols, and best practices Experience with threat modelling, risk assessment, and incident response Familiarity with security tools (e.g., Snyk, OWASP ZAP) Excellent communication and collaboration skills Self-learner and ability to execute tasks without supervision Ability to maintain the highest
track down the root cause. Communicate the impact of the problem to stakeholders in terms of business value, helping to set a priority for the resolution. Actively participate in incident responses. Engineering standards & frameworks - Maintain knowledge of Xero's current and emerging engineering standards and practices. Develop and deploy software that meets Xero's standards. Continuous improvement - Maintain knowledge
Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience by enabling actionable monitoring and alerting. Drive cloud cost visibility and optimization efforts across engineering through dashboards, tagging standards, and automation. Partner with stakeholders to … platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire, and develop talented platform engineers with
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
and availability of our security infrastructure. What You'll Be Doing * Managing Hardware Security Modules (HSMs) and cryptographic infrastructure * Creating, storing, and retiring encryption keys securely across multiple platforms * Supporting incident and change management processes * Collaborating with application, infrastructure, and support teams * Ensuring compliance with security standards and audit requirements * Contributing to project delivery and continuous improvement initiatives What We're Looking … work under pressure * Excellent communication and stakeholder management skills Nice to Have * ITIL Foundation certification * Security or project management certifications * Experience with tools like JIRA, Confluence, SharePoint * Background in incident response and risk management Benefits * Salary up to £41,000 depending on experience * Pension of 12% * Private medical * Discretionary bonus Please Note: This is a permanent role for UK residents
Google Cloud products, particularly in the context of analytics platforms or large-scale infrastructure. Strong understanding of Site Reliability Engineering (SRE) principles, including SLIs/SLOs, error budgets, and incident response. Experience with infrastructure as code (e.g., Terraform, Deployment Manager) and CI/CD pipelines. Proficiency in monitoring, logging, and observability tools (e.g., Stackdriver, Prometheus, Grafana). Proficient in … serverless architectures. Exposure to data analytics platforms or big data tools (e.g., BigQuery, Dataflow, Pub/Sub). Programming or scripting skills in Python, Go, or Bash. Experience with incident management, postmortems, and continuous improvement practices. About working for us Our ambition is to be the leading UK business for diversity, equity and inclusion supporting our customers, colleagues and