Product Quality and Support Strategist, Alerting and Incident Management About The Position Coralogix is a modern, full-stack observability platform transforming how businesses process and understand their data. Our unique architecture powers in-stream analytics without reliance on expensive indexing or hot storage. We specialize in comprehensive monitoring … and reducing observability spending by up to 70%. We seek a Quality and Support Strategist professional who ensures that the Coralogix Alerting and Incident Management Platform and Process exceed the quality and reliability standards, establish a competitive edge, and prevent failures, profit loss, or work stoppages. You … will be responsible for enhancing customer experience by ensuring efficient and effective alert management resolution, reducing engineering interruptions, and boosting product awareness. This role involves developing a robust knowledge base, identifying common usage issues, and creating solutions that establish the Alerting and Incident Management Platform's capabilities
Support Engineer - Incident Management, AWS Incident Response (AIR) Job ID: Amazon Development Centre Ireland Limited AWS Incident Response is at the heart of high availability of Amazon Web Services. We make customer impacting events shorter and less frequent by providing large scale event and incident … helps mitigate its impact, and much of our engineer time is spent on projects to improve the tooling and automation. We also provide manual incident management for AWS and other Amazon groups, directing the resolution of an issue with service teams, and diving deep into those events to … experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion. Key job responsibilities Critical Issue Resolution and Call Management: Act as the primary point of contact in a team rotation for customer impacting issues. Monitor performance graphs, drive resolution calls with a large
Incident Management Engineer, AWS Incident Detection and Response Sales, Marketing and Global Services (SMGS) AWS Sales, Marketing, and Global Services (SMGS) is responsible for driving revenue, adoption, and growth from the largest and fastest growing small- and mid-market accounts to enterprise-level customers including public sector. … success. AWS Support also partners with a global list of customers that are building mission-critical applications on top of AWS services. The AWS Incident Detection and Response team is part of the Enhanced Support Services (ES2) organisation within AWS Support, and is dedicated to offering eligible AWS Enterprise … Support customers proactive engagement and incident management to reduce the potential for failure and to accelerate recovery of critical workloads from disruption. We achieve these objectives by working closely with customers to develop runbooks and response plans customized to the context of each workload onboarded to the service.
our placemakers in delivering exceptional workplace experiences for our customers. Together, we make space for people and businesses to thrive. Location: London Job Title: Incident Manager - Barclays JOC To aid in the protection of Barclays operations through the reduction in the number of preventable incidents and to reduce incident impact through improvement of incident mitigation measures. The Incident Manager will be based within the Joint Operations Centre (JOC) in London and manage the ISS operational teams response to an Operational Facilities Incident which could potentially impact on Barclays through financial, regulatory or reputational damage. Main … purpose of the position : Working with a leading Global Bank we are changing the way we manage and report Incidents. As the ISS Incident Manager you will be part of a team of 4 working a 24/7 365 shift. Reporting to the Incident Management Team
Job Title: Incident Manager Location: Durham, North East England, DH1 1SL; Lytham St Anne's, North West England, FY8 4TS; Glasgow, Scotland, G2 8JX; City of Westminster, London (region), SW1P 3BT Remuneration: £42,315 - £45,300 Contract Details: Permanent Working Style: Hybrid Join Our Team! Are you an experienced … Incident Manager looking to make a real impact? Our client, a leading organisation in the Central Government sector, is on the lookout for a dynamic professional to support the Head of Service Operations. Why You Should Apply: Drive Change: Play a key role in improving incident management … resolution of incidents. Grow Professionally: Leverage your expertise in a high-volume operational environment, while enhancing your skills and knowledge. Key Responsibilities: Manage the Incident Management Process for Retail & B2B Clients. Ensure timely, accurate reporting of incidents and resolutions. Collaborate with Service Providers to identify and mitigate risks.
The Incident Resolution Specialist is responsible for owning the day-to-day incident resolution management process ensuring that disruptions to business operations are identified, recorded and resolved efficiently and effectively. This includes conducting root cause analysis and delivering the implementation of preventative measures leading to best-in … office teams and other internal stakeholders to resolve incidents, prioritising own workload according to the severity and urgency of the incidents. Follow the established incident management procedures and best practices and ensure compliance with policies and standards. Maintain clear and consistent communication with stakeholders during incidents, providing regular … Governance, Risk and Compliance (GRC) Tools to document, track progress, manage risks associated with front office operations, and identify trends and patterns. Conduct post-incident reviews to identify root causes and implement corrective actions to prevent recurrence, ensuring that the incident resolution meets at a minimum the agreed
Senior Product Engineer - AI-Powered Incident Management Up to £130k + equity London (Hybrid, 3-4 days in-office) We are partnering with an innovative B2B SaaS company that is transforming incident management for engineering teams. Their AI-driven platform integrates on-call scheduling, incident … With a focus on automation and intuitive design, their technology is already trusted by over 600 companies, helping over 10,000 responders optimise their incident management processes. Why Join? Work on something new and unexplored - AI-powered incident management is an open challenge, and you'll
subsidiary of Betsson Group, is one of the world’s leading names in sports betting technology and trading. We provide real-time pricing, risk management and trading capabilities for sportsbooks globally; our platform supports sportsbooks through advanced technology solutions, including a hybrid infrastructure spanning on-premise VMware data centres … within an Agile environment, participating in sprint-based workflows and managing tasks through Atlassian tools, including Jira and your role will also involve handling incident management via PagerDuty as this includes responding to and efficiently resolving incidents to ensure operational systems are maintained and SLAs are met. Technical … moderately complex software configurations for deployment and system components. Server Administration and Operating Systems Intermediate experience with Linux and Windows server operating systems. Preferable Incident Management and Platform Resilience Understanding of incident management processes, including prioritising, diagnosing and resolving incidents. Root cause analysis (RCA) skills and
Dartford, London, United Kingdom Hybrid / WFH Options
Bridge Recruitment UK Ltd
accounts Undertake customer operational service performance reporting covering service performance against SLAs, trends, costs & improvement plans reviewed with the business owners Take the service management lead in customer major incident management, working in conjunction with internal teams to drive service restoration, and producing high quality post major … incident reports (PMIRs) Take ownership of incidents and requests that are passed to the service delivery teams, providing the requisite support to the wider IT operation Manage both internal and external stakeholders effectively, developing and maintaining lasting relationships Engagement with sales teams and PMO, supporting commercial activities and account … plans As part of continuous improvement, assist with relevant ITIL disciplines - incident management, service transition, problem management, change management, service-level management etc Chair relevant customer meetings Manage customer service on-boarding Support customer retention by resolving customer concerns, liaise with internal teams to resolve
We have a fantastic opportunity for a talented individual to join Marsh in our Cyber Claims and Incident Management team in London. This is a hybrid role that has a requirement to work three days per week in the office. The role: Cyber Claims Advocate Marsh is seeking … a dedicated Cyber Claims Advocate to join our dynamic Cyber Claims and Incident Management Team. This is an exciting opportunity for Claims Specialists or individuals with 1-3 years of experience in Cyber Claims to enhance their career in a fast-paced environment. This role offers the opportunity … and excess insurers, advocating for clients through submissions and escalation meetings as necessary. Respond to client and broker queries regarding policy coverage and cyber incident response pre-incident/loss. Assist in managing vendor relationships and creating visually appealing content and presentations. What you need to have: Advanced
close collaboration with other IT Domains, such as IT Strategy and Architecture, Change and Portfolio and Digital Safety teams. Job Purpose The Service Portfolio Management Specialist manages either the I&O Platforms or Workplace Products portfolio, ensuring seamless delivery planning, service portfolio optimisation, and portfolio alignment within other portfolios. … the Platform/Product Increment (PI) planning sessions within their portfolio to align delivery roadmaps and manage dependencies within and across teams. IT Stakeholder Management - Communication & Coordination Effectively communicating about their service portfolio and plan and ensuring coordination and alignment with the rest of the IT portfolio. Collaborating on … planning and forecasting I&O Platform or Workplace Product demand and capacity with the I&O Platforms Partner or Workplace Product leadership. Risk & Compliance Management - Delivery Risk Management Proactively collaborating with Product and Platforms teams in order to identify risks in the delivery of the service portfolio. Defining
Product Manager, Alerting and Incident Management About The Position Coralogix is a modern, full-stack observability platform transforming how businesses process and understand their data. Our unique architecture powers in-stream analytics without reliance on expensive indexing or hot storage. We specialize in comprehensive monitoring of logs, metrics … and reducing observability spending by up to 70%. We are seeking a Product Manager to drive our product vision for the Alerting and Incident Management tooling across our platform and provide the robust foundation for Observability and Security teams to build on. Our Alerting engine offers real … deliver platform capabilities that can be leveraged across both Internal Products and Customer-facing User experience. Responsibilities Define the vision and roadmap for alerting and incident management. Collaborate with the Observability and Security teams to discover, align, and prioritize their needs, understand use cases and timing requirements, and plan phased
safety, security, availability, productivity, capacity and efficiency. We are looking for a proven Data Center Engineering Operations (DCEO) Engineer with experience in critical facilities management, and a result-driven individual with strong technical understanding and the drive and vision to take our data center operations to the next level. … for critical systems including switchgear, generators, UPS systems, power distribution equipment, chillers, cooling towers, computer room air handlers, building monitoring systems, etc.; Generate change management requests & incident management tickets for Data Center facility; Engage in non-office on-call responsibilities and respond promptly to emergency situations or … incident, such as power failures or natural disasters, and review to refine the established Emergency Operations Procedures to effectively handle these situations; Deeply involve with Colocation provider and contractors to resolve infrastructure engineering issues and conduct root cause analysis for operational issues; Establish comprehensive English documentation to business & facility
South East London, London, United Kingdom Hybrid / WFH Options
Resolver Inc
of their risks so they can make quick and effective decisions. As a part of the Resolver team, your work will help transform risk management to risk intelligence so organizations can protect people and assets and deliver on their purpose. We are ambitious in both our mission and our … successful implementations Creating design alternatives, producing work estimates, recommendations, and securing agreement on designs that satisfy customer requirements and reflect industry best practices Project Management Establishing a shared vision of project success during project initiation and confirming a common understanding of project scope, delivery approach, task ownership, and deliverables … Controlling and communicating project scope, schedule, budget, and risk to customers and management Leading regular project discussions with customers and project teams to review work plans, risks, actions, issues, and decisions that drive projects to completion and minimize time to value Business Analysis Gathering and documenting customer functional and
collaborative and dedicated security and business continuity manager, to focus on the protection of project critical assets and maintain Business Continuity (BC), Resilience and Incident Management (IM) strategies and plans. Reporting to the head of security, the security and business continuity manager will be responsible for implementing a … meetings, read literature, and participate in training or other educational offerings to keep abreast of new developments and technologies related to business continuity and incident management. Create and oversee area contingency plans, development, and operation. Create BC plans for core risks. (7 defined risks) Create and administer training and … data to HS2 Compile and submit quarterly return for HS2 on SCS resilience capability The Ideal Candidate Required Qualifications & Skills Proven experience in Security Management, Business Continuity, Risk Management, or other resilience disciplines Prior experience in Business Continuity/HILP (High Impact, Low Probability) risk management functions
IT security incidents and threats. This role demands a proactive individual with a strong technical background and problem-solving mindset. Key Responsibilities: Network Infrastructure Management: Must have experience in Cisco Meraki. Manage and maintain the wireless infrastructure, including wireless controllers and Access Points (APs). Oversee internet connectivity through … load balancing. Monitor and troubleshoot network performance, ensuring minimal downtime. Maintain and document network configurations, including IP addressing and VLAN setups. Firewall and Security Management: Must have experience in Sophos Firewall. Configure, manage, and update firewall policies to ensure network security. Handle VPN setups and user access management. Regularly … update and monitor firewall firmware to safeguard against vulnerabilities. Endpoint Security and Antivirus Management: Must have experience in Bitdefender or similar. Manage antivirus server installations and ensure deployment of agents across all endpoints. Enforce endpoint security policies, including USB port restrictions and system-level password management. Implement and oversee
companies in history, an organization that is in the center of the hurricane being created by the revolution in artificial intelligence. "VAST's data management vision is the future of the market." - Forbes VAST Data is the data platform company for the AI era. We are building the enterprise … in a 24/7 network operations center-style environment, ensuring the availability, reliability, and security of services. This role involves real-time monitoring, incident detection, incident management, incident resolution, and clear written and verbal communication with other teams and stakeholders. The Role Monitor clusters using … processes. Perform initial investigation and diagnosis of problems, escalating complex issues to support. Document incidents, including their details, troubleshooting steps, and resolutions in the incident tracking system. Collaborate with other teams, including Support, R&D, Account teams, and customers to ensure effective incident resolution and communication. Conduct routine
and unexpected disruptions and adapting to changes in our operating environment. Within the area of Security, Operational Resilience covers three separate but interconnected disciplines: Incident and Crisis Management (IM/CM), Business Continuity Management (BCM) and IT Service Continuity Management & IT Recovery (ITSCM & ITR). These … key action plans. Maintain the DOR Testing Framework, manage attestation results, and ensure testing procedures are documented and approved according to the ICT Risk Management Framework and in coordination with the Risk function. Work closely with testing owners across Security and Global Technology (IT), and AXA Group to align … and report overall DORT effectiveness to the ICT Risk Management Framework. Ensures that testing owners maintain and annually refresh the respective testing standards included in the DORT Framework. Review and analyse data from a maintained Dashboard, sample test reports, and additional evidence provided by testing owners to ensure the
Banking System. You will oversee their deployment, development, enhancements and production operations. You should have extensive experience and skills in Software Engineering, DevOps, project management, incident management, team management and stakeholder management. Additionally, you should have expertise in: Multi-threaded Java backend development API and SQL
Employment Type: Permanent
Salary: £125000 - £145000 per day + Equity and Benefits
to safeguard critical business operations by design and default. You will be responsible for security automation, CI/CD pipeline enhancements, and cloud security management, ensuring compliance with industry standards. Key Responsibilities Security & DevOps Integration: Support and extend the secured CI/CD pipeline to enhance development security. Work … secure AWS cloud infrastructure for clients and internal operations. Automate AWS infrastructure builds following CIS hardening standards. Ensure top-tier security configuration, access management, and incident response on cloud platforms. Operational Support & Incident Response: Support business-critical Windows and Linux-based environments. Monitor and respond to … security alerts across Infosec, servers, firewalls, and applications. Conduct continuous monitoring of internal and third-party information security controls. Threat & Vulnerability Management: Assess SAST (Static Application Security Testing) and DAST (Dynamic Application Security Testing) scans. Implement remediation and mitigation strategies in collaboration with development teams. Maintain network security protocols