systems, monitoring tools (e.g., Prometheus, Grafana), and caching (e.g., Redis). Strong communication and cross-functional collaboration skills. Desirable Skills: Familiarity with SRE principles and incident management tools (e.g., PagerDuty). Understanding of DevOps security practices and FinOps. Experience with MACH architecture, CDN optimization, and service mesh/API gateways. If you're interested, please get in touch ASAP.
successful in this role, you should have: Experience in architecture and engineering of Event Intelligence Solutions/AIOps platforms. Experience engineering monitoring platforms such as IBM Netcool, Moogsoft, BigPanda, PagerDuty, and ServiceNow AIOps. Proficiency in Python and hands-on knowledge of Ansible Automation Platform. Other highly valued skills include: Knowledge of observability platforms such as Prometheus, Grafana, ELK, and Splunk. Experience with integration into
London, South East, England, United Kingdom Hybrid / WFH Options
Salt Search
coverage planning to ensure 24/7 coverage. Act as a back-up during unforeseen coverage gaps. Process Oversight: Maintain and improve escalation workflows, including use of tools like PagerDuty and Salesforce. Ensure compliance with SLAs and contractual obligations. Client Assurance: Engage in senior-level conversations with clients to reassure them during service disruptions, incidents, and bugs. Incident Management: Coordinate … communication skills, with the ability to manage high-pressure situations and make decisions. Strong client relations skills and a can-do attitude. Familiarity with support platforms (e.g., Salesforce Case Management, ServiceNow, PagerDuty). Experience with business continuity practices and procedures. Ability to analyse data and trends to inform decision-making. Comfortable working out-of-hours and managing support coverage across geographies. Self