Altum Consulting are working with a global entertainment company to recruit an Incident Management Analyst based in Central London. This role will be for an initial 12-month FTC. Applying candidates must be on a short notice period. Incident Management Analyst role: Incident management … documents adhere to established guidelines and best practices Reporting and data analysis Liaising with finance, cybersecurity and other teams Continuous process improvement As an Incident Management Analyst you'll be/have: Bachelor's Degree (Accounting/Finance/Cybersecurity/Data Science etc.) Experience in an identity …
We are seeking a strategic and technically hands-on Product Quality & Support Strategist to help scale a cutting-edge Alerting & Incident Management Platform. This is a pivotal role focused on improving customer experience, reducing engineering disruption, and enhancing the platform’s overall product quality and technical impact. You … or a related technical field 5+ years of hands-on software/infrastructure engineering experience, especially with DevTools or Infrastructure Engineering Strong knowledge of incident management, alert routing, on-call workflows, and incident response strategies Experience working with or supporting SaaS-based observability or incident management … Hands-on experience with tools like PagerDuty, OpsGenie, ServiceNow, CloudWatch, Chronosphere, or similar Understanding of SLA/SLO implementation and performance tracking Exposure to incident management frameworks, automated remediation, and runbook automation Background in DevOps or SRE culture and tooling Prior people leadership or mentorship experience is a …
Reading, Oxfordshire, United Kingdom Hybrid / WFH Options
Thames Water Utilities Limited
you will be responsible for maintaining SecOps solutions, controls, and processes across the organisation, while mentoring and leading the SOC team to ensure effective management of OT alerts and incidents. This position requires a deep understanding of SecOps concepts, technologies, and best practices, specifically across IT and OT environments. … You will be tasked with ensuring robust incident management, proactive threat detection, and continuous improvement of our security posture. Strong communication and collaboration skills are essential as you will work closely with cross-functional teams to mitigate risks and protect Thames Water's essential services. What you'll … activities such as threat hunting to uncover vulnerabilities and ensure continuous risk reduction. • Provide tangible metrics to demonstrate risk reduction and reduced technical debt. Incident Readiness & Response: • Lead the incident triage and response process, ensuring effective management and remediation of cyber security incidents. • Improve incident management …
s in Computer Science, Engineering, Data Science, or related field 8+ years in enterprise technology, with 3-5 years in platform operations or service management Experience managing GenAI/ML platforms and LLM-based services (e.g., OpenAI, Anthropic, Azure OpenAI, Hugging Face) Proven in scaling MLOps or LLMOps practices … cross-functional teams in complex ecosystems Familiarity with monitoring tools like Prometheus, Grafana, Azure Monitor Excellent communication and stakeholder engagement skills Experience managing SLAs, incident management, continuous improvement Strategic and hands-on in fast-paced environments Professional English proficiency What will be your key responsibilities? In this role … eXperiences (MAX) Platform, ensuring high availability, scalability, and performance. LLMOps Implementation: Develop and operationalize LLMOps practices, including deployment, monitoring, versioning, and performance tuning. Service Management & Support: Establish incident management, SLAs, change management, and continuous improvement processes. Governance & Compliance: Ensure adherence to Responsible AI, data privacy, and …
Slough, South East England, United Kingdom Hybrid / WFH Options
Lawrence Harvey
further implement the Digital Operational Resilience Framework across the company e.g., refining and optimising existing policies, plans and procedures (in areas such as Risk Management, Incident Management, Business Continuity, Crisis Management, Third-Party Risk Management and Disaster Recovery), supporting the implementation of new technologies to … programme planning and overseeing the ongoing execution and reporting of testing as per the test schedule and remediation of gaps/vulnerabilities identified. Collating Management Information reporting from various business stakeholders on a quarterly basis to ensure effective reporting on resilience levels of Critical Functions to Senior Management … ensure regulatory requirements are clearly understood and documented. Preparing documentation to facilitate i) status reporting on specific projects and ii) regular reporting to Senior Management and Board of Directors at Committee Meetings. Participation in the internal/external audits and inspections as required. Attending industry events to keep abreast …
Reading, South East England, United Kingdom Hybrid / WFH Options
Lawrence Harvey
further implement the Digital Operational Resilience Framework across the company e.g., refining and optimising existing policies, plans and procedures (in areas such as Risk Management, Incident Management, Business Continuity, Crisis Management, Third-Party Risk Management and Disaster Recovery), supporting the implementation of new technologies to … programme planning and overseeing the ongoing execution and reporting of testing as per the test schedule and remediation of gaps/vulnerabilities identified. Collating Management Information reporting from various business stakeholders on a quarterly basis to ensure effective reporting on resilience levels of Critical Functions to Senior Management … ensure regulatory requirements are clearly understood and documented. Preparing documentation to facilitate i) status reporting on specific projects and ii) regular reporting to Senior Management and Board of Directors at Committee Meetings. Participation in the internal/external audits and inspections as required. Attending industry events to keep abreast …
Science, or a related technical field 8+ years of experience in enterprise technology roles, with 3-5 years focused on platform operations or service management Hands-on experience with managing GenAI/ML platforms and LLM-based services (e.g., OpenAI, Anthropic, Azure OpenAI, Hugging Face) Proven track record in … implementing and scaling MLOps or LLMOps practices in a production environment Certifications in cloud platforms (e.g., Azure, AWS, GCP) and/or ITIL Service Management preferred Advanced coursework or certifications in AI/ML, MLOps, or LLMOps is a strong plus Ongoing learning and participation in GenAI or platform … tools (e.g., Prometheus, Grafana, Azure Monitor) Exceptional communication and stakeholder engagement skills to partner with business, technical, and governance teams Experience managing platform SLAs, incident management, and continuous improvement cycles in high-availability environments Ability to balance strategic thinking with hands-on execution in a fast-paced, evolving …
Slough, South East England, United Kingdom Hybrid / WFH Options
Marsh McLennan
Company: Marsh Description: We have a fantastic opportunity for a talented individual to join Marsh in our Cyber Claims and Incident Management team in London. This is a hybrid role that has a requirement to work three days per week in the office. The role: Cyber Claims Advocate … Marsh is seeking a dedicated Cyber Claims Advocate to join our dynamic Cyber Claims and Incident Management Team. This is an exciting opportunity for Claims Specialists or individuals with 1-3 years of experience in Cyber Claims to enhance their career in a fast-paced environment. This role … and excess insurers, advocating for clients through submissions and escalation meetings as necessary. Respond to client and broker queries regarding policy coverage and cyber incident response pre-incident/loss. Assist in managing vendor relationships and creating visually appealing content and presentations. What you need to have: Advanced …
Slough, South East England, United Kingdom Hybrid / WFH Options
DGH Recruitment
law firm on a permanent basis. **Experience working for a law firm is essential** Key Responsibilities Manage the User Support team using the Service Management toolset and provide accurate reporting on performance Coaching and mentoring the team to ensure that personal development, service standards and are being consistently applied … identify problems and look to address these with the respective team Carry out all relevant technical support and maintenance activities as required by Change Management, Incident Management, Problem Management, and Service Request Management processes. Take accountability for service delivery performance, meeting customer expectations Liaise with … other internal support teams, internal senior management and suppliers in the day-to-day management of Incidents and Service Requests. And where appropriate initiate the escalation process for Major Incidents Take ownership of major incidents, coordinate with resolution parties, and establish effective communication between stakeholders for post-incident …
Slough, South East England, United Kingdom Hybrid / WFH Options
DGH Recruitment
law firm on a permanent basis. IT Service Delivery Lead/IT Support Lead Key Responsibilities: • Manage the User Support team using the Service Management toolset and provide accurate reporting on performance. • Coaching and mentoring the team to ensure that personal development, service standards and are being consistently applied. … identify problems and look to address these with the respective team • Carry out all relevant technical support and maintenance activities as required by Change Management, Incident Management, Problem Management, and Service Request Management processes. • Take accountability for service delivery performance, meeting customer expectations • Liaise with … other internal support teams, internal senior management and suppliers in the day-to-day management of Incidents and Service Requests. And where appropriate initiate the escalation process for Major Incidents. • Take ownership of major incidents, coordinate with resolution parties, and establish effective communication between stakeholders for post-incident …
Analyse current IT operations and identify gaps and opportunities for ITSM process implementation. Design, document, and implement ITSM processes including (but not limited to) Incident Management, Problem Management, Change Management, Service Request Management, Knowledge Management, and Configuration Management. Collaborate with stakeholders across departments to … needs. Define KPIs and metrics for ITSM processes and set up reporting dashboards. Configure and customize ITSM tools/platforms (e.g., ServiceNow, Jira Service Management, BMC Remedy). Provide training, guidance, and documentation for internal teams. Ensure alignment with ITIL or other relevant frameworks and industry standards. Qualifications: Deep … framework and ITSM best practices. Demonstrated experience building ITSM processes from scratch. Experience with ITSM platforms such as ServiceNow, Freshservice, or similar. Excellent stakeholder management and communication skills. ITIL certification (v3 or v4) strongly preferred. Does this sound like you? Please apply below.
Reading, Oxfordshire, United Kingdom Hybrid / WFH Options
Mobile Broadband Network Limited
Incident Assurance Manager Job ID PERM002892ML Department Details The Operational Services directorate is accountable for ensuring the network sites are always accessible and available. It undertakes the operation, enablement, and management of the network infrastructure to enable EE/BT and Three to deliver their best customer experiences … at the lowest cost. Reporting to the Senior Incident Assurance Manager, this role will involve relentlessly managing the delivery of Incident Assurance services by the supplier ecosystem to achieve agreed business outcomes and performance targets set by EE/BT, Three and the MBNL AOP. This is a … minimum of two days per week in our Central Reading office. What you will do: Manage and proactively drive Service and Site Availability and Incident resolution and ticketing KPIs and quality issues against contractual obligations and industry benchmarks. Identify and contribute efforts which will improve the methodologies, processes …
add to the event catalog for the relevant product or application. Implement automation for system provisioning, self-healing, auto recovery, deployment, and monitoring. Perform incident response and root cause analysis for critical system failures. Monitor system performance and establish service-level indicators (SLIs) and objectives (SLOs). Collaborate with … for continual service improvement for in-scope products & drive plan till successful closure Accountable for the in-scope product to ensure high availability performance. Problem Management Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions Coordinate with incident management teams, operations … different Service Operations and Engineering teams to develop and implement permanent solutions. Monitor the effectiveness of problem resolution activities, provide regular reports on problem management activities, and ensure continuous improvement. Event Management Define and maintain an event catalog, specifying active events, thresholds, and relevant remediation, and optimize it …
Energy/Utilities industry experience is a must* Key Responsibilities Service Delivery Management: · Oversee the performance of IT services, ensuring they meet agreed service levels (SLAs) and key performance indicators (KPIs) · Ensure all services are delivered on time, securely, and where appropriate, within the associated commercial and contractual obligations. · Manage … with third-party service providers and internal stakeholders to ensure effective service delivery as per agreed SLAs/OLAs · Responsible for overseeing the knowledge management process and related activities, including the capturing, sharing and accessibility of knowledge articles within ServiceNow SIAM-based Supplier Coordination: · Implement and manage the SIAM … agreements. · Act as the primary point of contact between Client internal teams and external service providers for IT service-related issues and escalations. Major Incident Management (MIM): · Accountable for effective management of the major incident management process ensuring that all major incidents are resolved in …
to safeguard critical business operations by design and default. You will be responsible for security automation, CI/CD pipeline enhancements, and cloud security management, ensuring compliance with industry standards. Key Responsibilities Security & DevOps Integration: Support and extend the secured CI/CD pipeline to enhance development security. Work … secure AWS cloud infrastructure for clients and internal operations. Automate AWS infrastructure builds following CIS hardening standards. Ensure top-tier security configuration, access management, and incident response on cloud platforms. Operational Support & Incident Response: Support business-critical Windows and Linux-based environments. Monitor and respond to … security alerts across Infosec, servers, firewalls, and applications. Conduct continuous monitoring of internal and third-party information security controls. Threat & Vulnerability Management: Assess SAST (Static Application Security Testing) and DAST (Dynamic Application Security Testing) scans. Implement remediation and mitigation strategies in collaboration with development teams. Maintain network security protocols …
Developer - London Hybrid - £60,000 I am looking for a Senior Power Platform Developer to join my client, an independent global insurance and investment management company, based in London. This company provides a wide range of managed services to mutuals across different verticals. Services include claims, underwriting and risk … management and compliance, and more. This role offers the opportunity to work with cutting-edge technologies, designing and delivering innovative solutions that drive business efficiency and digital transformation in a dynamic and fast-paced industry. The ideal candidate will bring expertise in the Microsoft Power Platform suite and a … Power BI, and Dynamics 365, to deliver robust and innovative business applications. Strong knowledge of Microsoft Dataverse, with a particular focus on efficient data management practices, scalable solutions, and ensuring data security across various applications. Extensive experience in setting up and managing CI/CD pipelines for Power Platform …
IN2-SaaS | International Software-as-a-Service Recruiters
CI/CD, and automation while guiding the Cloud Infrastructure & DevOps teams. The ideal candidate brings extensive knowledge in cloud infrastructure, DevOps, and project management, as well as a proven ability to manage and mentor high-performing teams. Key Responsibilities DevOps Strategy: Define and implement a cohesive DevOps vision … and ensure compliance with security standards. Collaboration: Work closely with software engineering, QA, and product teams to streamline workflows and enhance software quality. Cost Management: Optimize cloud costs and work with finance to manage budgets. Incident Management: Ensure effective monitoring, incident management, and root cause …
operated and maintained safely, within SLA, and with minimal risk. As the main point of contact, the Lead Engineer undertakes the day-to-day management of all activities of the Technical Operations Site Team, ensuring that all reactive and planned maintenance tasks are delivered to agreed timelines and SLAs … maintenance tasks on building services infrastructure, ensuring compliance with CMMS systems and operational procedures. Monitor SLA compliance, communicate effectively with client representatives, and facilitate incident management processes. Provide support to Data Center and Site Management teams, including communication assistance, technical information provision, and holiday/sickness coverage … to building services infrastructure issues, collaborating with Site Engineers, and escalating to third-party specialist contractors where specialist knowledge and skills are required Coordinate incident management, maintaining & issuing incident reports using CMMS systems. Operate within defined Service Operations processes and work within site-specific standard operating procedures. …
this position, you will support our team while driving change and developing new ideas that will shape the future of our operations. Role Scope: Management Structure & People Management • Serves as second port of call when IBX manager or supervisor is unavailable. • Supports IBX manager and site supervisors in … the CFE supervisor/manager. • Attends and contributes to Customer MBRs and any key customer meetings. • Acts as an escalation point during major incidents. Incident Management • Provides technical support during incidents - directs team to failed components and guides response. • Supports the incident management process, collaborating closely … act as a technical lead during incidents, confidently directing teams and troubleshooting complex issues. • Strong organisational skills and experience in planning projects, including vendor management and customer communications. • Collaborative and approachable - able to provide day-to-day support and mentorship across all levels, from apprentices to senior team members. …
Slough, South East England, United Kingdom Hybrid / WFH Options
Randstad Enterprise
Siemens UK – a global infrastructure group. This role is essential in supporting the operation and maintenance of London’s critical road charging and traffic management infrastructure. The successful candidates will play a vital role in monitoring live traffic systems, ensuring fault management, and providing first-line support to … such as 1st Line Support Technician, Desktop Support Technician, Service Desk Analyst, or similar IT support environments – particularly those who are confident working with incident management systems and have a strong eye for detail in technical monitoring scenarios. Responsibilities: Monitor and log incidents through the fault management … critical metrics Register faults and assist callers with routing their issues appropriately Support continuous improvement by providing system/process feedback Adhere to Quality Management, Health and Safety, and Security protocols Key Skills/Experience Required: Essential: Experience in a 1st Line Support or Service Management Centre environment …