London, South East, England, United Kingdom Hybrid / WFH Options
Tate Professional
virtualisation platforms, storage, backups, and Linux systems using tools such as Ansible, Terraform, and GitHub. Collaborate with cross-functional teams to align infrastructure delivery with DevOps best practices. Lead incident response, root cause analysis, and ongoing support for critical infrastructure services. Define and implement infrastructure administration standards and procedures. Champion Infrastructure as Code and continuous improvement across the
ambitious roadmap, but we also collaborate closely with Engineering squads to deliver complex transversal initiatives, and look at how we can constantly improve the developer experience. Operational Excellence: Lead incident response efforts, overseeing the investigation and resolution of infrastructure-related issues, including following up on post-mortem actions and championing this across the business. This person is the
video services. Operational Excellence: Act as a change agent to drive continuous improvement in operational processes, procedures, and tooling. Define and enforce best practices for service monitoring, alerting, and incident response. Incident Management: Lead the technical recovery of production services during major incidents, ensuring rapid resolution and minimal impact. Conduct post-incident reviews and implement corrective actions … Strong understanding of video streaming protocols, encoding/transcoding workflows. Demonstrated ability to lead technical recovery during high-pressure incidents. Familiarity with observability tools (e.g., Grafana, Prometheus, Datadog) and incident management platforms (e.g., PagerDuty, Opsgenie). Excellent communication and stakeholder management skills. Strong analytical and problem-solving abilities. What's in it For You? Hybrid Work Model: We've
more cloud platforms: Azure, AWS, or GCP. • Strong background in cloud-native engineering and solution architecture. • Demonstrated experience in applying SRE practices such as SLIs, SLOs, error budgets, and incident response. • Exposure to AI/ML technologies and their application in DevOps and SRE. • Strong leadership and mentoring experience. • Excellent communication, collaboration, and problem-solving skills. • Bachelor’s or
and product teams to bring features to life efficiently, balancing speed with technical excellence. Delivery & Operations: Oversee day-to-day engineering operations, from sprint planning to release cycles and incident response. Security & Compliance: Ensure the platform meets industry best practices around security, data privacy (e.g., GDPR), and compliance standards (e.g., SOC 2, ISO 27001). Stakeholder Communication: Act as
We are representing a consultancy that is a leader in the Cyber Security and Incident Response space. If you have experience leading the legal aspects of data breach cases, this could be the role for you. This role is open to any of the multiple offices my client has across the UK. The client is looking for a … Principal Associate to support and shape the delivery of expert incident response, digital risk, and cyber advisory services for a broad portfolio of global clients, from tech innovators and major insurers to public sector bodies and emergency services. This award-winning cyber group is uniquely positioned at the intersection of law, digital forensics, and strategic response. With capabilities … that span incident response, regulatory strategy, privacy law, threat intelligence, security controls, and tech litigation, they’re rewriting how legal support is delivered in high-pressure digital environments. What You’ll Be Doing You’ll play a critical role across matters ranging from real-time cyber incidents to regulatory investigations, and ongoing advisory support. Key responsibilities include: Leading
we specialise in cloud migration and development, digital transformation including agile software development, DevOps, automation, data and machine learning. We're looking for a hands-on Service Operations & Incident Manager to support a dynamic team managing platforms that power data-driven loyalty solutions for global partners. This is a high visibility role where you'll work across multiple functions … ensuring service stability, managing incidents, and driving continuous improvement. You'll work closely with a range of major loyalty partners, maintaining strong service performance, smooth partnership interactions, and seamless incident handling; requiring a mix of relationship management, technical understanding, and operational leadership. Please note: You must be able to attend our customer office in Warrington a few times per … month. Key responsibilities: Act as a frontline lead for service operations across partner-facing platforms. Manage incident response end-to-end: triage, coordination, communication and resolution. Run post-incident reviews, coordinate root cause analyses (RCAs) and support ongoing problem management. Work across teams to track service stability, performance and improvement opportunities. Collaborate with product and delivery teams
to you, please read on. Are you interested in ensuring customers can always watch their favourite movie or show? If so, you might be the right person for the Incident Manager role in the READI team who drive availability for Prime Video. Key job responsibilities - Lead calls on customer impacting, high severity, outages that drive towards resolution by co … ordinating efforts across multiple engineering and operational teams, including for ambiguous problems we might not have seen before. - Deconstruct complex incidents into workstreams that can be managed by multiple incident responders in parallel. - Monitor and manage communications during high severity events via relevant channels, including being the single point of contact for executive leaders - Drive critical, complex customer escalations … in situations that are sometimes technically challenging in collaboration with Engineering Teams - Own improving the effectiveness of incident response by driving continuous improvement of standard operating procedures and the tools that help you resolve incidents efficiently. - Proactively identify opportunities for improvement through gap analysis, trend identification, and cross-functional collaboration. - Act as a key stakeholder for the engineering
Manager for the EMEA region. In this role, you will be supporting the creation and enforcement of Jefferies’ Business Continuity Program, including policy reviews/updates, business impact analysis, incident monitoring and response and more. This role will also help lead the BC Regulatory program to horizon scan for regulatory updates/changes that would apply, and provide … in Compliance and Legal. Recommend recovery strategies and assist with implementation of recovery solutions. Plan and coordinate regular testing exercises and simulations to test the effectiveness of BC/incident management plans and to fulfill various regulatory requirements. Participate in any internal and industry wide tabletop exercises. Support and lead Business Continuity awareness training for new employees and recurring … coordinate security alerts and the traveler safety program for potential risks to Jefferies staff and offices. Monitoring news & alerts for incidents that may affect Jefferies’ offices and travelers. Support incident response efforts, specifically documenting and gathering timelines, data points and action items, and following up with responsible parties for close-out of assigned action items. Collaborate with various
sponsored events in the UK and the wider region. Deliver the Region's Security Awareness and Education initiatives, including operational security, risk management, resilience, and brand protection. Manage security incident responses, investigations, and provide advice to personnel at all levels. Conduct Site Security Assurance audits and manage the Global Security Services contract within the area of responsibility, ensuring high … briefings for management teams and employees, along with other security education materials. Establish and maintain liaison with international and national intelligence and law enforcement agencies to support global security response activities, including providing a 24/7 security response service as necessary. Identify threats to the business units, assess risks, and provide proportionate advice to effectively manage these … risks. Candidate Requirements This role requires a strong understanding of corporate security procedures, risk management methodologies, investigations, incident management, crisis and continuity management, operational security threat management, security intelligence, and security technology. Excellent written and oral communication skills, along with strong presentation skills, are essential to support all levels of client leadership with reliable, meaningful information for fact-based
designing, implementing, and maintaining robust infrastructure to support a fast-paced, mission-critical environment. You'll also collaborate with support and infrastructure teams to ensure seamless operations and rapid incident response. Key Responsibilities: Design and implement strategic plans for network and security infrastructure. Lead threat detection and response using tools like SIEM, Sophos MTR, and Splunk. Manage Firewalls … VPNs, intrusion detection systems, and endpoint protection. Conduct annual penetration testing and remediate vulnerabilities. Support disaster recovery planning and incident response. Provide BAU support and mentor junior IT staff. Tech Stack & Tools: Security: Checkpoint, Cisco ISE, Zscaler, Sophos MDR, Mimecast, Okta, Fortinet. Networking: Cisco Catalyst/Nexus, Proxy (Squid/Pfsense), CUCM. Infrastructure: VMware, Dell VxRail, Rubrik, RecoverPoint
Amazon Development Centre Ireland Limited Amazon Central Technical Operations Services (CTOS) maintains high availability for the Amazon Retail Website and is the team that provides the first line of incident response to protect it. We make customer impacting events shorter, less frequent, severe, and impactful by providing large scale incident and response management. The Amazon Retail … this is the team to join. Key job responsibilities • Drive the resolution of large-scale customer impacting issues as part of a globally rotating team • Design, build, and enhance incident detection and management tools • Participate in Agile sprints to evolve business processes and technologies • Create and review documentation, design new standard operating procedures • Identify and troubleshoot recurring platform issues
days in the office, 2 days from home. The Role: Support the delivery of robust information security and privacy practices across global operations. Conduct security risk assessments, support incident response, and contribute to audits and compliance initiatives. Maintain and enhance the firm's ISMS and Business Continuity frameworks. Complete client cyber due diligence and collaborate closely with internal
In this role you will also act as the first point of contact for security-related incidents, and do other investigative work including malware analysis, email forensics, and other incident response activities. The successful candidate will be a hands-on, technically skilled security professional with experience across a broad range of cybersecurity disciplines (red/purple and blue
and KRIs to the CISO, stakeholders and ExCo. Support the CISO with the preparation of business cases, proposals and assistance with high impact presentations. Deputise for the CISO during incident response activities, if they are unavailable to perform their duties in the event of a major live incident. Contribute to regional information security budgeting and resource planning to … Development to deliver targeted security training and capability-building programmes across business units in the EMEA region. Act as the regional escalation point for security incidents, coordinating with global incident response teams to ensure timely and effective resolution and post-incident reviews. Support the assessment and monitoring of third-party vendors and partners of business units within
within complex systems and applications. This role provides strategic direction on project planning, scheduling, methodologies, and activities aligned with capacity and budget considerations. The position also involves ensuring rapid incident response, problem escalation, and effective troubleshooting using multiple system management and problem management tools. Additionally, the Lead will mentor team members, provide technical expertise, and foster best practices. … deadlines, resolve complex technical challenges with innovative approaches, and proactively identify opportunities for process and system improvements. Keep abreast of emerging technologies and industry trends. Oversee change management and incident response activities, including performing root-cause analysis investigations and bug fixes as required. Lead and mentor team members by providing coaching, training, performance evaluations, and fostering a
Japan, EMEA, and the Americas to align project goals and execution. Ensure solutions meet business needs, quality standards, and regulatory requirements (e.g., MiFID II). Provide operational support and incident response as needed, without compromising strategic initiatives. Qualifications Proficiency in both Japanese and English for effective stakeholder communication. Significant experience in IT within the banking/trading industry
datacenters to include preventive maintenance, corrective maintenance, and change management. - Vendor management of colocation datacenter services providers to meet or exceed contracted performance SLAs. - Safety, security, and availability incident response, incident management, and incident resolution. - Continuous improvement of operational processes, procedures, methods, and tools. About the team AWS values diverse experiences. Even if you do
operations across all FX trading platforms. Collaborate with development teams to design and deploy fault-tolerant, scalable solutions that align with evolving business goals. Enforce adherence to change management, incident management, and problem management policies as well as specific non-financial risk frameworks required by the organisation. Mentor junior team members by sharing knowledge and promoting a culture of engineering … for automation demonstrated through hands-on experience building tools that reduce manual intervention while mitigating operational risks within large-scale environments. Experience enforcing governance frameworks related to change management, incident response or problem resolution within regulated industries adds significant value. A collaborative approach that fosters teamwork across diverse groups including developers, traders and other business partners is vital
This is a complex environment, as you will own the most critical part of the customer experience and deliver on our customers' most basic need. While we obsess over incident response, in this role you will also develop tools to scale our service quality, and provide critical input for product prioritization to address root causes of why the … customer experienced an incident in the first place. Our advertising customers are likely Amazon customers, and we take seriously maintaining the high customer service bar set by Amazon. Key job responsibilities - Independently handle complex customer issues by reproducing cases, root cause analysis, and providing prioritization input - Demonstrate deep technical expertise and advanced problem-solving for critical programmatic advertising issues
support team to drive continuous improvement in service delivery quality. Provide professional insights into AC/DC charging technologies, including fault diagnosis and issue analysis. Lead maintenance process optimization, incident response mechanisms, and standardization of service workflows. Act as a coordination and technical interface in major service issues, ensuring efficient problem resolution for customers. Service Operations Support Support … the development and optimization of preventive maintenance, troubleshooting, and spare parts management processes. Monitor and promote the execution of Service Level Agreements (SLAs) to improve response times and customer satisfaction. Work with customer success, sales teams, and third-party service providers to ensure consistent service delivery. Responsible for the selection, onboarding, and management of service partners, ensuring their competence … requirements and coordinate resources to ensure efficient closure of technical and service issues. Remote Monitoring & Fault Management Collaboration Coordinate with remote monitoring teams to enhance proactive alerting and issue response mechanisms. Support the application and advancement of remote diagnostics and predictive maintenance capabilities. Qualifications & Requirements Education & Experience Bachelor's degree or above in Electrical Engineering, Mechanical Engineering, or a