Incident Management Jobs in Berkshire

1 to 25 of 107 Incident Management Jobs in Berkshire

Incident Management & Prevention Analyst

Slough, South East England, United Kingdom
Altum Consulting
Altum Consulting are working with a global entertainment company to recruit an Incident Management Analyst based in Central London. This role will be for an initial 12-month FTC. Applying candidates must be on a short notice period. Incident Management Analyst role: Incident management … documents adhere to established guidelines and best practices Reporting and data analysis Liaising with finance, cybersecurity and other teams Continuous process improvement As an Incident Management Analyst you'll be/have: Bachelor's Degree (Accounting/Finance/Cybersecurity/Data Science etc.) Experience in an identity More ❯

Product Specialist

Slough, South East England, United Kingdom
ITR Partners
We are seeking a strategic and technically hands-on Product Quality & Support Strategist to help scale a cutting-edge Alerting & Incident Management Platform. This is a pivotal role focused on improving customer experience, reducing engineering disruption, and enhancing the platform’s overall product quality and technical impact. You … or a related technical field 5+ years of hands-on software/infrastructure engineering experience, especially with DevTools or Infrastructure Engineering Strong knowledge of incident management, alert routing, on-call workflows, and incident response strategies Experience working with or supporting SaaS-based observability or incident management … Hands-on experience with tools like PagerDuty, OpsGenie, ServiceNow, CloudWatch, Chronosphere, or similar Understanding of SLA/SLO implementation and performance tracking Exposure to incident management frameworks, automated remediation, and runbook automation Background in DevOps or SRE culture and tooling Prior people leadership or mentorship experience is a More ❯

OT Cyber Security Analyst

Reading, Berkshire, United Kingdom
Hybrid / WFH Options
Thames Water Utilities Limited
you will be responsible for maintaining SecOps solutions, controls, and processes across the organisation, while mentoring and leading the SOC team to ensure effective management of OT alerts and incidents. This position requires a deep understanding of SecOps concepts, technologies, and best practices, specifically across IT and OT environments. … You will be tasked with ensuring robust incident management, proactive threat detection, and continuous improvement of our security posture. Strong communication and collaboration skills are essential as you will work closely with cross-functional teams to mitigate risks and protect Thames Water's essential services. What you'll … activities such as threat hunting to uncover vulnerabilities and ensure continuous risk reduction. • Provide tangible metrics to demonstrate risk reduction and reduced technical debt. Incident Readiness & Response: • Lead the incident triage and response process, ensuring effective management and remediation of cyber security incidents. • Improve incident management More ❯
Employment Type: Permanent
Salary: GBP Annual

Sr Manager, Digital Experience GenAI Platforms Operations

Slough, Berkshire, United Kingdom
ENGINEERINGUK
s in Computer Science, Engineering, Data Science, or related field 8+ years in enterprise technology, with 3-5 years in platform operations or service management Experience managing GenAI/ML platforms and LLM-based services (e.g., OpenAI, Anthropic, Azure OpenAI, Hugging Face) Proven in scaling MLOps or LLMOps practices … cross-functional teams in complex ecosystems Familiarity with monitoring tools like Prometheus, Grafana, Azure Monitor Excellent communication and stakeholder engagement skills Experience managing SLAs, incident management, continuous improvement Strategic and hands-on in fast-paced environments Professional English proficiency What will be your key responsibilities? In this role … eXperiences (MAX) Platform, ensuring high availability, scalability, and performance. LLMOps Implementation: Develop and operationalize LLMOps practices, including deployment, monitoring, versioning, and performance tuning. Service Management & Support: Establish incident management, SLAs, change management, and continuous improvement processes. Governance & Compliance: Ensure adherence to Responsible AI, data privacy, and More ❯
Employment Type: Permanent
Salary: GBP Annual

Operational Resilience Analyst

Slough, South East England, United Kingdom
Hybrid / WFH Options
Lawrence Harvey
further implement the Digital Operational Resilience Framework across the company e.g., refining and optimising existing policies, plans and procedures (in areas such as Risk Management, Incident Management, Business Continuity, Crisis Management, Third-Party Risk Management and Disaster Recovery), supporting the implementation of new technologies to … programme planning and overseeing the ongoing execution and reporting of testing as per the test schedule and remediation of gaps/vulnerabilities identified. Collating Management Information reporting from various business stakeholders on a quarterly basis to ensure effective reporting on resilience levels of Critical Functions to Senior Management … ensure regulatory requirements are clearly understood and documented. Preparing documentation to facilitate i) status reporting on specific projects and ii) regular reporting to Senior Management and Board of Directors at Committee Meetings. Participation in the internal/external audits and inspections as required. Attending industry events to keep abreast More ❯

Operational Resilience Analyst

Reading, South East England, United Kingdom
Hybrid / WFH Options
Lawrence Harvey
further implement the Digital Operational Resilience Framework across the company e.g., refining and optimising existing policies, plans and procedures (in areas such as Risk Management, Incident Management, Business Continuity, Crisis Management, Third-Party Risk Management and Disaster Recovery), supporting the implementation of new technologies to … programme planning and overseeing the ongoing execution and reporting of testing as per the test schedule and remediation of gaps/vulnerabilities identified. Collating Management Information reporting from various business stakeholders on a quarterly basis to ensure effective reporting on resilience levels of Critical Functions to Senior Management … ensure regulatory requirements are clearly understood and documented. Preparing documentation to facilitate i) status reporting on specific projects and ii) regular reporting to Senior Management and Board of Directors at Committee Meetings. Participation in the internal/external audits and inspections as required. Attending industry events to keep abreast More ❯

Sr Manager, Digital Experience GenAI Platforms Operations

Slough, Berkshire, United Kingdom
Mars, Incorporated and its Affiliates
Science, or a related technical field 8+ years of experience in enterprise technology roles, with 3-5 years focused on platform operations or service management Hands-on experience with managing GenAI/ML platforms and LLM-based services (e.g., OpenAI, Anthropic, Azure OpenAI, Hugging Face) Proven track record in … implementing and scaling MLOps or LLMOps practices in a production environment Certifications in cloud platforms (e.g., Azure, AWS, GCP) and/or ITIL Service Management preferred Advanced coursework or certifications in AI/ML, MLOps, or LLMOps is a strong plus Ongoing learning and participation in GenAI or platform … tools (e.g., Prometheus, Grafana, Azure Monitor) Exceptional communication and stakeholder engagement skills to partner with business, technical, and governance teams Experience managing platform SLAs, incident management, and continuous improvement cycles in high-availability environments Ability to balance strategic thinking with hands-on execution in a fast-paced, evolving More ❯
Employment Type: Permanent
Salary: GBP Annual

Sr Manager, Digital Experience GenAI Platforms Operations

Windsor, England, United Kingdom
Mars IS UK
Science, or a related technical field 8+ years of experience in enterprise technology roles, with 3–5 years focused on platform operations or service management Hands-on experience with managing GenAI/ML platforms and LLM-based services (e.g., OpenAI, Anthropic, Azure OpenAI, Hugging Face) Proven track record in … implementing and scaling MLOps or LLMOps practices in a production environment Certifications in cloud platforms (e.g., Azure, AWS, GCP) and/or ITIL Service Management preferred Advanced coursework or certifications in AI/ML, MLOps, or LLMOps is a strong plus Ongoing learning and participation in GenAI or platform … tools (e.g., Prometheus, Grafana, Azure Monitor) Exceptional communication and stakeholder engagement skills to partner with business, technical, and governance teams Experience managing platform SLAs, incident management, and continuous improvement cycles in high-availability environments Ability to balance strategic thinking with hands-on execution in a fast-paced, evolving More ❯

Sr Manager, Digital Experience GenAI Platforms Operations

Slough, England, United Kingdom
Mars IS UK
Science, or a related technical field 8+ years of experience in enterprise technology roles, with 3–5 years focused on platform operations or service management Hands-on experience with managing GenAI/ML platforms and LLM-based services (e.g., OpenAI, Anthropic, Azure OpenAI, Hugging Face) Proven track record in … implementing and scaling MLOps or LLMOps practices in a production environment Certifications in cloud platforms (e.g., Azure, AWS, GCP) and/or ITIL Service Management preferred Advanced coursework or certifications in AI/ML, MLOps, or LLMOps is a strong plus Ongoing learning and participation in GenAI or platform … tools (e.g., Prometheus, Grafana, Azure Monitor) Exceptional communication and stakeholder engagement skills to partner with business, technical, and governance teams Experience managing platform SLAs, incident management, and continuous improvement cycles in high-availability environments Ability to balance strategic thinking with hands-on execution in a fast-paced, evolving More ❯

Sr Manager, Digital Experience GenAI Platforms Operations

Maidenhead, England, United Kingdom
Mars IS UK
Science, or a related technical field 8+ years of experience in enterprise technology roles, with 3–5 years focused on platform operations or service management Hands-on experience with managing GenAI/ML platforms and LLM-based services (e.g., OpenAI, Anthropic, Azure OpenAI, Hugging Face) Proven track record in … implementing and scaling MLOps or LLMOps practices in a production environment Certifications in cloud platforms (e.g., Azure, AWS, GCP) and/or ITIL Service Management preferred Advanced coursework or certifications in AI/ML, MLOps, or LLMOps is a strong plus Ongoing learning and participation in GenAI or platform … tools (e.g., Prometheus, Grafana, Azure Monitor) Exceptional communication and stakeholder engagement skills to partner with business, technical, and governance teams Experience managing platform SLAs, incident management, and continuous improvement cycles in high-availability environments Ability to balance strategic thinking with hands-on execution in a fast-paced, evolving More ❯

Cyber Claims Advocate

Slough, South East England, United Kingdom
Hybrid / WFH Options
Marsh McLennan
Company: Marsh Description: We have a fantastic opportunity for a talented individual to join Marsh in our Cyber Claims and Incident Management team in London. This is a hybrid role that has a requirement to work three days per week in the office. The role: Cyber Claims Advocate … Marsh is seeking a dedicated Cyber Claims Advocate to join our dynamic Cyber Claims and Incident Management Team. This is an exciting opportunity for Claims Specialists or individuals with 1-3 years of experience in Cyber Claims to enhance their career in a fast-paced environment. This role … and excess insurers, advocating for clients through submissions and escalation meetings as necessary. Respond to client and broker queries regarding policy coverage and cyber incident response pre-incident/loss. Assist in managing vendor relationships and creating visually appealing content and presentations. What you need to have: Advanced More ❯

IT Support Team Lead (Law Firm)

Slough, South East England, United Kingdom
Hybrid / WFH Options
DGH Recruitment
law firm on a permanent basis. **Experience working for a law firm is essential** Key Responsibilities Manage the User Support team using the Service Management toolset and provides accurate reporting on performance Coaching and mentoring the team to ensure that personal development, service standards and are being consistently applied … identify problems and look to address these with the respective team Carry out all relevant technical support and maintenance activities as required by Change Management, Incident Management, Problem Management, and Service Request Management processes. Take accountability for service delivery performance, meeting customer expectations Liaise with … other internal support teams, internal senior management and suppliers in the day-to-day management of Incidents and Service Requests. And where appropriate initiate the escalation process for Major Incidents Take ownership of major incidents, coordinate with resolution parties, and establish effective communication between stakeholders for post-incident More ❯

IT Service Delivery Lead/IT Support Lead

Slough, South East England, United Kingdom
Hybrid / WFH Options
DGH Recruitment
law firm on a permanent basis. IT Service Delivery Lead/IT Support Lead Key Responsibilities: • Manage the User Support team using the Service Management toolset and provides accurate reporting on performance. • Coaching and mentoring the team to ensure that personal development, service standards and are being consistently applied. … identify problems and look to address these with the respective team • Carry out all relevant technical support and maintenance activities as required by Change Management, Incident Management, Problem Management, and Service Request Management processes. • Take accountability for service delivery performance, meeting customer expectations • Liaise with … other internal support teams, internal senior management and suppliers in the day-to-day management of Incidents and Service Requests. And where appropriate initiate the escalation process for Major Incidents. • Take ownership of major incidents, coordinate with resolution parties, and establish effective communication between stakeholders for post-incident More ❯

Senior Information Technology Service Management Consultant

Slough, South East England, United Kingdom
La Fosse
Analyse current IT operations and identify gaps and opportunities for ITSM process implementation. Design, document, and implement ITSM processes including (but not limited to) Incident Management, Problem Management, Change Management, Service Request Management, Knowledge Management, and Configuration Management. Collaborate with stakeholders across departments to … needs. Define KPIs and metrics for ITSM processes and set up reporting dashboards. Configure and customize ITSM tools/platforms (e.g., ServiceNow, Jira Service Management, BMC Remedy). Provide training, guidance, and documentation for internal teams. Ensure alignment with ITIL or other relevant frameworks and industry standards. Qualifications: Deep … framework and ITSM best practices. Demonstrated experience building ITSM processes from scratch. Experience with ITSM platforms such as ServiceNow, Freshservice, or similar. Excellent stakeholder management and communication skills. ITIL certification (v3 or v4) strongly preferred. Does this sound like you? Please apply below More ❯

Incident Assurance Manager

Reading, Berkshire, United Kingdom
Hybrid / WFH Options
Mobile Broadband Network Limited
Incident Assurance Manager Job ID PERM002892ML Department Details The Operational Services directorate is accountable for ensuring the network sites are always accessible and available. It undertakes the operation, enablement, and management of the network infrastructure to enable EE/BT and Three to deliver their best customer experiences … at the lowest cost. Reporting to the Senior Incident Assurance Manager, this role will involve relentlessly managing the delivery of Incident Assurance services by the supplier ecosystem to achieve agreed business outcomes and performance targets set by EE/BT, Three and the MBNL AOP. This is a … minimum of two days per week in our Central Reading office. What you will do: Manage and proactively drive Service and Site Availability and Incident resolution and ticketing KPI's and quality issues against contractual obligations and industry benchmarks. Identify and contribute efforts which will improve the methodologies, processes More ❯
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

Slough, South East England, United Kingdom
IGT Solutions
add to the event catalog for the relevant product or application. Implement automation for system provisioning, self-healing, auto recovery, deployment, and monitoring. Perform incident response and root cause analysis for critical system failures. Monitor system performance and establish service-level indicators (SLIs) and objectives (SLOs). Collaborate with … for continual service improvement for inscope products & drive plan till successful closure Accountable for the in scope product to ensure high availability performance. Problem Management Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions Coordinate with incident management teams, operations … different Service Operations and Engineering teams to develop and implement permanent solutions. Monitor the effectiveness of problem resolution activities, provide regular reports on problem management activities, and ensure continuous improvement. Event Management Define and maintain an event catalog, specifying active events, thresholds, and relevant remediation, and optimize it More ❯

ITSM

Slough, South East England, United Kingdom
Vallum Associates
Energy/Utilities industry experience is must* Key Responsibilities Service Delivery Management: · Oversee the performance of IT services, ensuring they meet agreed service levels (SLAs) and key performance indicators (KPIs) · Ensure all services are delivered on time, securely, and where appropriate, within the associated commercial and contractual obligations. · Manage … with third-party service providers and internal stakeholders to ensure effective service delivery as per agreed SLAs/OLAs · Responsible for overseeing the knowledge management process and related activities, including the capturing, sharing and accessibility of knowledge articles within ServiceNow SIAM-based Supplier Coordination: · Implement and manage the SIAM … agreements. · Act as the primary point of contact between Client internal teams and external service providers for IT service related issues and escalations. Major Incident Management (MIM): · Accountable for effective management of the major incident management process ensuring that all major incidents are resolved in More ❯

DevSecOps Engineer

Slough, South East England, United Kingdom
Hazeltree
to safeguard critical business operations by design and default. You will be responsible for security automation, CI/CD pipeline enhancements, and cloud security management, ensuring compliance with industry standards. Key Responsibilities Security & DevOps Integration: Support and extend the secured CI/CD pipeline to enhance development security. Work … secure AWS cloud infrastructure for clients and internal operations. Automate AWS infrastructure builds following CIS hardening standards. Ensure top-tier security configuration, access management, and incident response on cloud platforms. Operational Support & Incident Response: Support business-critical Windows and Linux-based environments. Monitor and respond to … security alerts across Infosec, servers, firewalls, and applications. Conduct continuous monitoring of internal and third-party information security controls. Threat & Vulnerability Management: Assess SAST (Static Application Security Testing) and DAST (Dynamic Application Security Testing) scans. Implement remediation and mitigation strategies in collaboration with development teams. Maintain network security protocols More ❯

Power Platform Developer

Slough, South East England, United Kingdom
Ada Meher
Developer - London Hybrid - £60,000 I am looking for a Senior Power Platform Developer to join my client, an independent global insurance and investment management company, based in London. This company provide a wide range of managed services to mutuals across different verticals. Services include claims, underwriting and risk … management and compliance, and more. This role offers the opportunity to work with cutting-edge technologies, designing and delivering innovative solutions that drive business efficiency and digital transformation in a dynamic and fast-paced industry. The ideal candidate will bring expertise in the Microsoft Power Platform suite and a … Power BI, and Dynamics 365, to deliver robust and innovative business applications. Strong knowledge of Microsoft Dataverse, with a particular focus on efficient data management practices, scalable solutions, and ensuring data security across various applications. Extensive experience in setting up and managing CI/CD pipelines for Power Platform More ❯

Director of DevOps

Slough, South East England, United Kingdom
IN2-SaaS | International Software-as-a-Service Recruiters
CI/CD, and automation while guiding the Cloud Infrastructure & DevOps teams. The ideal candidate brings extensive knowledge in cloud infrastructure, DevOps, and project management, as well as a proven ability to manage and mentor high-performing teams. Key Responsibilities DevOps Strategy: Define and implement a cohesive DevOps vision … and ensure compliance with security standards. Collaboration: Work closely with software engineering, QA, and product teams to streamline workflows and enhance software quality. Cost Management: Optimize cloud costs and work with finance to manage budgets. Incident Management: Ensure effective monitoring, incident management, and root cause More ❯

Director of DevOps

Reading, South East England, United Kingdom
IN2-SaaS | International Software-as-a-Service Recruiters
CI/CD, and automation while guiding the Cloud Infrastructure & DevOps teams. The ideal candidate brings extensive knowledge in cloud infrastructure, DevOps, and project management, as well as a proven ability to manage and mentor high-performing teams. Key Responsibilities DevOps Strategy: Define and implement a cohesive DevOps vision … and ensure compliance with security standards. Collaboration: Work closely with software engineering, QA, and product teams to streamline workflows and enhance software quality. Cost Management: Optimize cloud costs and work with finance to manage budgets. Incident Management: Ensure effective monitoring, incident management, and root cause More ❯

Lead Engineer

Slough, Berkshire, United Kingdom
Digital Realty
operated and maintained safely, within SLA, and with minimal risk. As the main point of contact, the Lead Engineer undertakes the day-to-day management of all activities of the Technical Operations Site Team, ensuring that all reactive and planned maintenance tasks are delivered to agreed timelines and SLAs … maintenance tasks on building services infrastructure, ensuring compliance with CMMS systems and operational procedures. Monitor SLA compliance, communicate effectively with client representatives, and facilitate incident management processes. Provide support to Data Center and Site Management teams, including communication assistance, technical information provision, and holiday/sickness coverage … to building services infrastructure issues, collaborating with Site Engineers, and escalating to third-party specialist contractors where specialist knowledge and skills are required Coordinate incident management, maintaining & issuing incident reports using CMMS systems. Operate within defined Service Operations processes and work within site-specific standard operating procedures. More ❯
Employment Type: Permanent

Senior Critical Facilities Engineer (Lead Specialist, VI)

Slough, Berkshire, United Kingdom
Equinix, Inc
this position, you will support our team while driving change and developing new ideas that will shape the future of our operations. Role Scope: Management Structure & People Management • Serves as second port of call when IBX manager or supervisor is unavailable. • Supports IBX manager and site supervisors in … the CFE supervisor/manager. • Attends and contributes to Customer MBRs and any key customer meetings. • Acts as an escalation point during major incidents. Incident Management • Provides technical support during incidents - directs team to failed components and guides response. • Supports the incident management process, collaborating closely … act as a technical lead during incidents, confidently directing teams and troubleshooting complex issues. • Strong organisational skills and experience in planning projects, including vendor management and customer communications. • Collaborative and approachable - able to provide day-to-day support and mentorship across all levels, from apprentices to senior team members. More ❯
Employment Type: Permanent
Salary: GBP Annual

Lead Technical Mechanical/Electrical Engineer VI (Monday - Friday, Day Role)

Slough, Berkshire, United Kingdom
Equinix, Inc
this position, you will support our team while driving change and developing new ideas that will shape the future of our operations. Role Scope: Management Structure & People Management • Serves as second port of call when IBX manager or supervisor is unavailable. • Supports IBX manager and site supervisors in … the CFE supervisor/manager. • Attends and contributes to Customer MBRs and any key customer meetings. • Acts as an escalation point during major incidents. Incident Management • Provides technical support during incidents - directs team to failed components and guides response. • Supports the incident management process, collaborating closely … act as a technical lead during incidents, confidently directing teams and troubleshooting complex issues. • Strong organisational skills and experience in planning projects, including vendor management and customer communications. • Collaborative and approachable - able to provide day-to-day support and mentorship across all levels, from apprentices to senior team members. More ❯
Employment Type: Permanent
Salary: GBP Annual

1st Line Support Operator

Slough, South East England, United Kingdom
Hybrid / WFH Options
Randstad Enterprise
Siemens UK – a global infrastructure group. This role is essential in supporting the operation and maintenance of London’s critical road charging and traffic management infrastructure. The successful candidates will play a vital role in monitoring live traffic systems, ensuring fault management, and providing first-line support to … such as 1st Line Support Technician, Desktop Support Technician, Service Desk Analyst, or similar IT support environments—particularly those who are confident working with incident management systems and have a strong eye for detail in technical monitoring scenarios. Responsibilities: Monitor and log incidents through the fault management … critical metrics Register faults and assist callers with routing their issues appropriately Support continuous improvement by providing system/process feedback Adhere to Quality Management, Health and Safety, and Security protocols Key Skills/Experience Required: Essential: Experience in a 1st Line Support or Service Management Centre environment More ❯

Incident Management, Berkshire
10th Percentile: £28,500
25th Percentile: £37,250
Median: £37,500
75th Percentile: £45,813
90th Percentile: £51,250