SRE principles to ensure reliability, performance, and resilience of the SACM platform. Embed SACM into 24x7 operations and observability platforms to support real-time decision-making. Support incident prevention, root cause analysis, and continuous improvement through data-driven insights. Define and enforce service level objectives (SLOs) and key performance indicators (KPIs) for SACM health and value. Governance …
and ever-changing landscape. The business analyst will be assigned to a number of initiatives within the Core Infrastructure Value Stream. The role primarily involves engaging stakeholders, creating documentation, and analysing processes, data, and data landscapes to enable the initiatives to drive value and deliver positive outcomes for the business. The role will require a flexible individual who can work on a … number of analysis tasks at once, that can be used by the team and presented to various levels of stakeholders. They will need to be able to work at pace in a demanding environment. Responsibilities Support the Product Owner to help inform prioritisation and delivery decisions Perform requirements elicitation, root cause analysis, as-is/to-be … mapping, gap analysis, business case development, backlog story creation/maintenance Facilitate internal and external stakeholder workshops, building a valuable relationship with our business and technology community Consume and understand complex requirements and turn these into valuable product-driven outputs Analyse and document business processes, data flows, and system interactions Collaborate with engineers, testers, and other team members to …
learn more about this opportunity, feel free to reach out and apply today! Responsibilities: Monitor and analyse security events within the SOC, ensuring timely detection and response. Perform threat analysis and vulnerability assessments, and implement mitigation strategies. Develop and refine incident response playbooks and procedures. Conduct root cause analysis (RCA) for high-priority incidents to prevent recurrence. … Have: Minimum of two years' experience in a SOC or managed security environment. Strong knowledge of network security (firewalls, IDS/IPS, VPNs). Proficiency in incident response, threat analysis, and vulnerability management. Experience working with SIEM tools for monitoring and event analysis. Understanding of malware analysis, forensic investigations, and endpoint security. Strong analytical and problem-solving skills.
our personalized learning opportunities - just to name a few! Job Description Your Career You will work firsthand with our valued customers to address their complex post-sales concerns where analysis of situations or data requires an in-depth evaluation of many factors. You're a critical thinker in understanding the methods, techniques, and evaluation criteria for obtaining results. You … issues via ticketing systems, phone, and remote sessions. Troubleshoot complex problems at both the application and operating system levels using deep technical knowledge and collaboration with internal teams. Identify root causes (code, configuration, or environment), and work with engineering and product teams to deliver permanent solutions. Share insights from customer interactions to improve our product and support experience. Document … troubleshooting steps and resolutions clearly for both internal and customer use. Lead root cause analysis and coordinate corrective actions to prevent recurrence. Qualifications Your Experience Mandatory Requirements 🔒 Due to the nature of this role and the customers we support, candidates must either: Have lived in the UK for the last 5 consecutive years, or Hold British Citizenship
Management Services and Wealth Loan Processing & Transaction Management functions, contributing to an improvement of controls and a reduction in residual risk Develop and implement robust testing strategies, leveraging data analysis, automation, and AI technologies to enhance the efficiency, coverage, and effectiveness of QA activities Provide credible challenge to front-line credit execution teams and escalate material issues with clear … root cause analysis and risk-based recommendations Deliver high-quality review outputs, communicating findings to senior stakeholders through clear, concise reporting Manage relationships with key stakeholders including Quality Assurance Directors (QADs), Function Heads, Independent Risk, Audit and ICM senior leadership, fostering transparency and collaboration. Lead or support process automation and tooling efforts (e.g., Python, Alteryx, Excel VBA …
Responsibilities: Lead end-to-end incident response investigations and containment efforts Communicate directly with clients during live cyber incidents, offering reassurance and expert guidance Produce detailed incident reports with root cause analysis and actionable recommendations Perform forensic and log analysis using SIEM, EDR, SOAR, and other security tools Collaborate across teams to enhance response playbooks and …
Knowledge Management: Maintain up-to-date technical documentation, including API/interface catalogues, data flow diagrams, environment runbooks, and integration design patterns Incident and Service Request Administration: Assist in root cause analysis for integration-related issues, serving as the primary point of contact for documenting, triaging, and coordinating the resolution of incidents and service requests. Change Coordination … a conduit between the development team and project teams to ensure consistent, transparent, and professional communication Education and Experience: Bachelor's degree in computer science, information technology, engineering, systems analysis or a related field, or equivalent experience A minimum of three years in a technology-related capacity with direct exposure to software development or IT project environments. At least …
reduce manual operational work ("toil") through scripting. Reliability Engineering & Incident Management (30%) Monitor health of trading systems with a goal of proactive failure prevention. Own and improve incident response, root cause analysis, and blameless post-mortems. Design and validate failover and disaster recovery strategies. Collaborate with developers to design robust, testable deployment pipelines. Operations & Cross-Team Collaboration … especially under pressure. Soft Skills & Attributes Analytical mindset and curiosity-driven troubleshooting. Calm, decisive demeanor during critical incidents. Empathy for both internal users and downstream systems. Bias toward eliminating root causes rather than treating symptoms. Educational & Professional Qualifications Educated to degree (or equivalent) level or higher, preferably from a leading university. Bachelor's or Master's in Computer Science …
hands-on deep dive when required. Leading the teams during Major Incidents and providing recommendations on the fastest path to major incident recovery, or supporting technical delivery teams with root cause analysis for Major Incidents Experience of working on both SAFe and Agile project delivery Supervisory/Managerial responsibilities (please specify if the position will have persons … SOLVING Uses rigorous logic and methods to solve difficult problems with effective solutions and probes all fruitful sources for answers. Can see hidden problems and is excellent at detailed analysis by looking beyond the obvious and doesn't stop at the first answer. TECHNICAL LEARNING Able to learn new skills quickly and is adept at learning new industry skills and …
Employment Type: Contract
Rate: From £450 to £500 per day. Daily rates are inside IR35
Stevenage, Hertfordshire, England, United Kingdom Hybrid / WFH Options
MBDA
to review SOC alerting in collaboration with SOC analysts to effectively triage and manage Tier 1 SOC alerts to the appropriate outcome. Experience with LDAP, and application traffic flow root cause analysis. Previous experience to identify root cause from (TBC for review - Demonstrable understanding of the OSI Reference Model and the network communication protocols, including but …
operational performance, and security compliance. Facilitate effective communication between IT teams and business units. Problem Solving and Incident Management: Manage and resolve high-priority incidents and critical issues. Conduct root cause analysis and implement corrective actions to prevent recurrence. Develop and maintain incident response plans and procedures. Requirements: Proven experience as a Digital Operations Manager, IT Manager …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Smart DCC
metering community. Translate threat trends into actionable insights and drive improvements across the organisation. Evaluate and recommend tools that enhance detection and response capabilities. Conduct forensic investigations and perform root cause analysis of security incidents. What are we looking for? Proven experience in incident response and leading investigations in complex environments. Strong understanding of the cyber threat …
hands-on deep dive when required. Leading the teams during Major Incidents and providing recommendations on the fastest path to major incident recovery, or supporting technical delivery teams with root cause analysis for Major Incidents Experience of working on both SAFe and Agile project delivery Your Profile Essential skills/knowledge/experience: MS Azure solution architect …
Woking, Surrey, United Kingdom Hybrid / WFH Options
Arrow McLaren IndyCar
be key, by using data, analytics, and machine learning to deliver world championship reliability tools. Role Dimensions: The Software & Data Science group in McLaren F1 is responsible for the analysis, design, and delivery of software tools and methodologies which improve the team and car's performance. We are a cross-functional group, bringing together data science, machine learning, software … engineering, and DevOps to deliver performance-focused platforms and solutions. In reliability engineering, you will understand issue tracking and management, root cause analysis, integrating with other systems through APIs, and will have experience in building complex user interfaces that can present and manage large amounts of data. As a Senior Specialist Software Engineer, your role will … combine elements of technical leadership, agile/lean project delivery, and stakeholder management. You'll be involved in all stages of the development life cycle from initial analysis through deployment, monitoring, and support. You will own systems architecture for the software you deliver, integrating with the wider McLaren F1 racing platform, and will balance the requirements of reliability engineering …
and environment standardization Build and manage CI/CD pipelines with Jenkins, GitHub Actions, or AWS CodePipeline Perform administrative and troubleshooting tasks on Linux-based systems, including log analysis and performance tuning. Lead technical triage and root cause analysis for infrastructure-related issues Develop and deploy applications using Docker and AWS Fargate Use CloudWatch, CloudTrail …
Birmingham, Staffordshire, United Kingdom Hybrid / WFH Options
Hogan Lovells
Responsibilities/Accountabilities The Team Providing third-line support to a rich array of high profile legal and information systems which are fundamental to the firm. Proactive management and analysis of the global server estate which encompasses Web, SSO, SharePoint and SQL Server infrastructure, and identify opportunities for improvement. Ensuring the design and architecture of Web, SharePoint, Azure Cloud … the Core Services and Platforms team as defined in the Terms of Reference. To create detailed documentation of all systems and procedures that are relevant to the service including Root Cause Analysis and Knowledge Base articles to enhance the support of the services provided by the team. Contribute to departmental strategy, manage product lifecycles, help develop and …
Gloucester, Gloucestershire, South West, United Kingdom Hybrid / WFH Options
Queen Square Recruitment Limited
production environments for smooth rollouts Document processes and share knowledge with team members Participate in change management activities, including CAB meetings Provide 2nd/3rd line incident resolution and root cause analysis reports Collaborate with deployment teams to drive continuous improvement Offer on-call and out-of-hours support as required (rota basis) Travel within the UK …
Burton-on-the-wolds, Leicestershire, United Kingdom
Matchtech
ITSM) processes including asset, change, incident, request, problem, and project management to meet service levels. Providing on-site IT support and assisting in resolving broader technical issues. Contributing to root cause analysis and long-term problem management. Acting as a key point of contact between IT and users, promoting standards, improving user satisfaction, and sharing best practices.
Loughborough, Leicestershire, Burton on the Wolds, United Kingdom
Matchtech
ITSM) processes including asset, change, incident, request, problem, and project management to meet service levels. Providing on-site IT support and assisting in resolving broader technical issues. Contributing to root cause analysis and long-term problem management. Acting as a key point of contact between IT and users, promoting standards, improving user satisfaction, and sharing best practices.
Compliance and Security Controls: Implement and monitor controls to ensure infrastructure build and release processes meet regulatory and internal compliance requirements. Incident and Problem Management: Oversee incident response and root cause analysis related to build and release operations, ensuring timely resolution and preventative measures. Performance Monitoring and Optimization: Monitor build and release performance metrics and implement optimizations …
Salford, Greater Manchester, North West, United Kingdom Hybrid / WFH Options
AWD Online
and monitoring of disaster recovery solutions and backup strategies Ensure compliance with internal security policies and regulatory requirements (e.g., GDPR, ISO 27001, PCI DSS v4.0) Provide 3rd line support and root cause analysis for complex issues Write PowerShell scripts to automate and streamline administrative tasks Document system configurations, changes and standard operating procedures Participate in infrastructure projects, including …
and testing efforts to maintain software quality and performance. Support CI/CD pipelines using Jenkins and contribute to automated testing and deployment. Troubleshoot and resolve production issues, performing root cause analysis and providing timely solutions. Mentor junior engineers and share knowledge across the team to foster a collaborative working environment. Basic Qualifications: Bachelor's degree in …
issues, or when they occur, quickly recover service. Partner with development teams to improve system reliability, observability, and release velocity. Participate in on-call rotations, incident response, postmortems, and root cause analysis and resolution. Be a vocal advocate of strong/sound engineering practices that allow us to build, deploy, and run scalable, reliable, and performant services.