to-end tests on code commits and pull-requests. • Monitor pipeline health and test results; collaborate with DevOps to optimize build times, parallelize tests, and reduce pipeline flakiness. Result Analysis & Root Cause • Analyze test outputs, system logs, and metrics (e.g., via ELK Stack or Prometheus/Grafana) to pinpoint failures and performance regressions. • Lead root-cause … testing activity efficiently. An ISTQB Foundation Certification is a strong asset and shows your commitment to professional testing standards. A key part of this role involves problem investigation and root cause analysis, so strong analytical and communication skills are a must. You'll enjoy working as part of a collaborative team, contributing your insights to improve outcomes
Bletchley, Buckinghamshire, United Kingdom Hybrid / WFH Options
In Technology Group
with IT and development teams to ensure secure system architecture and application development. Maintain and enhance incident response procedures and disaster recovery plans. Investigate and document security breaches, providing root cause analysis and remediation plans. Conduct security awareness training for staff and ensure compliance with internal policies and regulatory requirements (e.g., FCA, GDPR, ISO 27001). Stay
Milton Keynes, Buckinghamshire, South East, United Kingdom Hybrid / WFH Options
In Technology Group Limited
with IT and development teams to ensure secure system architecture and application development. Maintain and enhance incident response procedures and disaster recovery plans. Investigate and document security breaches, providing root cause analysis and remediation plans. Conduct security awareness training for staff and ensure compliance with internal policies and regulatory requirements (e.g., FCA, GDPR, ISO 27001). Stay
New Milton, Hampshire, United Kingdom Hybrid / WFH Options
Appello
infrastructure and cloud services. Deep understanding of SIP, VoIP, VoLTE, STUN, and firewall bridging. Proficiency in Node.js application support and server diagnostics. Hands-on experience using tools for SIP analysis, such as Wireshark, SIP Traces, or packet analysers. Excellent problem-solving and communication skills. Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience … Azure Solutions Architect, or AWS equivalent. ITIL Foundation certification THE ROLE Key Responsibilities Advanced Technical Support Resolve complex hardware, software, and network issues escalated from lower-tier support. Conduct root cause analysis and implement long-term solutions. Manage high-impact incidents to ensure minimal business disruption. Server & Application Support Troubleshoot server issues across cloud (AWS), on-premise
and compliance requirements. • Act as the primary point of contact for internal business units (including Operations, Compliance & Transactional Banking), IT and external vendors, regarding service performance and enhancements. • Lead root cause analysis and resolution of major incidents. Drive problem management to reduce recurring issues and improve service stability. • Manage projects involving any future enhancements or regulatory changes
technical level to install cyber security product technologies and systems, such as firewalls, endpoint protection, encryption, VPN, SIEM, PAM, VM etc. Support the Cyber Security Teams to lead root cause analysis of cyber security related incidents to ensure prompt action is taken to prevent incident reoccurrence and strengthen relevant cyber security controls. Provide technical guidance and
Woking, England, United Kingdom Hybrid / WFH Options
Pyramid Recruitment Ltd
customer support. Key Responsibilities: Customer Support & Issue Resolution – Investigate and resolve escalated support tickets, meeting SLA targets. Communicate effectively with customers via email, phone, and support portal. Technical Troubleshooting & Analysis – Perform root cause analysis using SQL, application logs, and API integrations to identify and resolve system issues. User Acceptance Testing (UAT) – Support customers during UAT cycles … Cloud & SaaS Technologies (Microsoft Azure, Office 365, Azure AD, MFA). PowerShell, Bash scripting, and basic Python/JavaScript for automation. Experience with ITSM tools (Jira), ITIL framework, and root cause analysis. What's on Offer: Competitive Salary – Up to £35,000 Hybrid Working 25 days holiday (rising to 30) + Birthday Day + Bank Holidays
Southampton, Hampshire, United Kingdom Hybrid / WFH Options
Aztec
Oversee technology issues management and risk acceptance processes. Lead on the 2LoD review of material Technology Incidents and Risk Events ensuring that actual/potential losses, fix details and root cause analysis are reported in a timely and accurate manner within risk governance. Strategic challenge of 1LoD identification and evaluation of risks associated with technology regulatory change … of mitigation strategies. Escalate material technology risks and issues within the Chief Risk Office and to wider risk governance and recommend appropriate mitigation. Provide insightful data-driven technology risk analysis to support risk-based decision-making. Report emerging technology risks within risk governance as part of integrated risk reporting. Provide subject matter expertise on emerging technology risks, including cloud … as ITIL, COBIT, NIST, ISO. Demonstrable extensive relevant experience of technology and change/operational risk in either a 1LoD or 2LoD capacity (2LoD preferable). Experience in scenario analysis and resilience impact assessments would be advantageous. Core skills and competencies A strong working knowledge of Microsoft products including Excel and Word, strong analytical skills and ability to provide
South East London, England, United Kingdom Hybrid / WFH Options
Tate Recruitment
storage, backups, and Linux systems using tools such as Ansible, Terraform, and GitHub. Collaborate with cross-functional teams to align infrastructure delivery with DevOps best practices. Lead incident response, root cause analysis, and ongoing support for critical infrastructure services. Define and implement infrastructure administration standards and procedures. Champion Infrastructure as Code and continuous improvement across the hosting
on uptime, resilience, and cost-efficiency. Perform performance tuning, system optimisation, and capacity planning to ensure infrastructure reliability and maintainability. Engage in Major Incident and Problem Management, including conducting Root Cause Analysis (RCA) and implementing long-term solutions. Conduct regular reviews across client estates, proactively identifying and addressing potential risks or inefficiencies. Develop and maintain comprehensive system
with ERP systems and their process integration. Good overall knowledge and experience of the SC/OtC business processes. Excellent understanding of EDI systems. Capable in problem solving and root cause analysis. Continuous Improvement mindset. Data analytics and reporting skills (Analytical). Strong written and verbal communication. Excellent MS Office skills, specifically Excel. Ability to Lead and embrace
South East London, England, United Kingdom Hybrid / WFH Options
Explore Group
and scale Kubernetes clusters hosting critical microservices Design and enhance observability, alerting, and incident response processes Collaborate closely with engineers to ensure systems are reliable, secure, and performant Lead root cause analysis for production incidents and help prevent recurrence Build tooling to automate repetitive tasks and improve deployment pipelines (CI/CD) Participate in on-call rotation
prem environments. What You’ll Be Doing: Managing and supporting Solace PubSub+ appliances and software brokers across cloud and on-prem platforms Responding to production incidents and working on root cause analysis and long-term fixes Monitoring system health and performance with Prometheus, Grafana, and custom dashboards Optimising Solace across WAN environments for secure, low-latency message
operational performance, and security compliance. Facilitate effective communication between IT teams and business units. Problem Solving and Incident Management: Manage and resolve high-priority incidents and critical issues. Conduct root cause analysis and implement corrective actions to prevent recurrence. Develop and maintain incident response plans and procedures. Requirements: Proven experience as a Digital Operations Manager, IT Manager
Portsmouth, Hampshire, South East, United Kingdom Hybrid / WFH Options
Sopra Steria Limited
not limited to Cisco Routing, Switching, Security, SDN, Unified Communications and Wireless technologies). Identify and explore opportunities for enhancing efficiency, leveraging orchestration technologies to streamline and automate. Lead 'Root Cause Analysis' investigations into network faults, security and performance issues. Support the Principal NetOps Engineer and Architects with project implementation. Liaise with third party service providers for
Document as you go - to help colleagues follow what's been done and why Drive tasks forward with energy and enthusiasm Create proactive monitoring solutions using standard tooling Conduct RCA (Root Cause Analysis) for incidents Develop and maintain self-managing infrastructure services and dashboards Define the metrics of success and report on progress Implement Infrastructure as Code for … automation tools (Puppet, Ansible, Git), pipelines (Azure DevOps) and test automation Experience with CI/CD tooling (Azure DevOps) Comfortable with Elasticsearch log standardisation, Kibana dashboard creation and data analysis skills AWS hands-on - CloudFormation, Route 53, S3, DynamoDB, CloudWatch, Lambda, Security, and troubleshooting, Azure experience also useful Certificate management and automation Strong troubleshooting and diagnosis skills Able
maintain project plans, schedules, and budgets. Manage & control the project costs & financial performance, including approval of timesheets. Facilitate stakeholder meetings to align project goals and address concerns proactively. Conduct root cause analysis for issues and propose corrective actions. Oversee project scope, risks, and changes, ensuring alignment with project objectives. Prepare and present status reports to clients and … feedback constructively. Required Skills Strong leadership and problem-solving abilities. Excellent communication and interpersonal skills. Proficiency in project management tools and techniques. Ability to work independently with some oversight. Root cause analysis and continuous improvement mindset. Preferred Skills A recognised project management certification, such as CAPM, PRINCE2, or APM, is preferred. Demonstrable understanding of both Agile and
Reading, Berkshire, United Kingdom Hybrid / WFH Options
DCL
escalations Conduct advanced threat hunting using the Microsoft Security Stack. Build, optimise and maintain workbooks, rules, analytics etc. Correlate data across Microsoft 365 Defender, Azure Defender and Sentinel. Perform root cause analysis and post-incident reporting. Aid in mentoring and upskilling Level 1 and 2 SOC analysts. Required Skills & Experience: The ability to achieve UK Security Clearance
a hands-on leadership role - you won’t just guide others, you’ll be the go-to expert when systems are under pressure. You'll lead incident response, own root cause analysis, and solve performance issues like memory leaks, outages, and flaky services. Your focus will include: Leading incident management, post-mortems, and blameless RCAs Building scalable
improvement projects as per company's programs and needs. In this context, the QA Engineer is involved in the qualification process, responsible for quality improvement actions, supporting with data analysis and reporting. To succeed in this mission, the Quality Engineer needs to build collaborative links internally with Engineering, Supply Chain, Quality and Production departments. Key Roles & Responsibilities Contribute to … level determined for internal KPIs, plan and monitor appropriate actions and propose improvement in the processes. Ensure effective implementation of corrective and preventive actions internally by supporting on root cause analysis and effectiveness check of corrective actions. Provide the necessary data to enable continuous improvement actions as needed. Ensure proper quality procedures are implemented (proper controls … frequency and tools) and monitor results. Manage projects and key indicators by creating suitable data analysis and report findings based on statistical evidence and trending. Qualifications At least 2 years of experience in manufacturing. Experience in Quality Assurance and Quality audits. Candidate with experience from the Smart Card industry will be a plus. Education: Bachelor's Degree in an
as well as training users in these systems and the use of reports. This position is responsible for extracting data from multiple sources, manipulating and validating data, and conducting root cause analysis and will also present analytic findings. They play an essential role in presenting operational solutions and recommendations to leadership. This involves gathering requirements, drawing insights … collaboratively in a cross-functional team, learns from colleagues, and provides routine updates on calls related to projects What we are looking for: Required Skills: • Experience with systems functional analysis, technology business analysis, and basic understanding of the different technical platforms, databases, and related technologies • Advanced knowledge of MS SQL Server, Tableau, MS Excel (functions and formulas) and
playbooks, and escalation procedures. Incident Response & Threat Intelligence Own the full life cycle of security incidents from detection to remediation and post-incident review. Perform advanced threat hunting and root cause analysis across cloud workloads, Kubernetes clusters, APIs, and user activity. Integrate external threat intelligence feeds, aligning TTPs with the MITRE ATT&CK framework. Drive continuous improvement