Permanent Root Cause Analysis Jobs in London

1 to 25 of 308 Permanent Root Cause Analysis Jobs in London

Reporting Analyst

London, United Kingdom
Aquent
Customer Experience & Operational Insights Analyse voice of customer data (e.g., tNPS, verbatims) and operational metrics (e.g., handle times, wait times, contact drivers). Conduct root cause analysis to identify drivers of customer dissatisfaction and inefficiencies. Provide prioritised, data-backed recommendations, focusing on high-volume or high-value … partner performance in up-sell and cross-sell during support interactions. Assess site- and agent-level effectiveness in identifying and passing sales leads. Deliver root cause analysis and strategic recommendations to drive improvements. Share findings through clear artefacts to support regional sales growth. 3. AI & Process Automation … to boost insight generation. Partner with teams to ensure adoption and business alignment. Provide documentation and training on AI-enabled workflows. Key Skills Data Analysis & Interpretation: Strong ability to analyse customer and operational data for insights. Problem Solving: Skilled in root cause analysis and issue resolution.
Employment Type: Permanent
Salary: GBP Annual

Senior / Lead Systems Design Engineer (Glasgow)

London, UK
Solutions Driven
that spans across the system design to become a vital expert in the laser system. PRIMARY DUTIES & RESPONSIBILITIES. Ownership of control system design reliability analysis, including documented failure mode analysis. Analysis of system adherence to safety standards. Critically reviewing design choices with the needs of all stakeholders considered … Working with prototype systems and critically looking for any improvements through both empirical and paper-based detailed analysis. Leading or actively being involved in root cause analysis of any failures that occur in the early manufacturing cycle. Working closely with team members, especially optical, software and electronics … the optical teams. Owning a synoptic view of the system design whilst being part of the teams that perform the detailed developments. Troubleshooting and root cause failure analysis for prototype units through the development process. Contributing to troubleshooting and coordinating corrective actions for manufacturing or field issues

Site Reliability Engineer

London Area, United Kingdom
IGT Solutions
Responsible for the proactive support of products so that there is high product performance that is continuously improved. Responsible for identifying and resolving the root causes of operational incidents, implementing solutions to improve stability and prevent recurrence. Manages the creation and maintenance of the event catalog to trigger events … event catalog for the relevant product or application. Implement automation for system provisioning, self-healing, auto recovery, deployment, and monitoring. Perform incident response and root cause analysis for critical system failures. Monitor system performance and establish service-level indicators (SLIs) and objectives (SLOs). Collaborate with development … products & drive plan till successful closure Accountable for the in scope product to ensure high availability performance. Problem Management Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions Coordinate with incident management teams, operations experts and collaborate with different Service Operations

Site Reliability Engineer

London, South East England, United Kingdom
IGT Solutions
Responsible for the proactive support of products so that there is high product performance that is continuously improved. Responsible for identifying and resolving the root causes of operational incidents, implementing solutions to improve stability and prevent recurrence. Manages the creation and maintenance of the event catalog to trigger events … event catalog for the relevant product or application. Implement automation for system provisioning, self-healing, auto recovery, deployment, and monitoring. Perform incident response and root cause analysis for critical system failures. Monitor system performance and establish service-level indicators (SLIs) and objectives (SLOs). Collaborate with development … products & drive plan till successful closure Accountable for the in scope product to ensure high availability performance. Problem Management Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions Coordinate with incident management teams, operations experts and collaborate with different Service Operations

Sr. Business Analyst Manager

London, United Kingdom
Amazon
KPIs supporting business Design and develop highly available dashboards and metrics using SQL and Excel/Quicksight or other BI reporting tools Perform business analysis and data queries using scripting languages like R, Python etc Design, implement and support end-to-end analytical solutions that are highly available, reliable … secure, and scale economically Collaborate cross-functionally to recognize and help adopt best practices in reporting and analysis, data integrity, test design, analysis, validation, and documentation Proactively identify problems and opportunities and perform root cause analysis/diagnosis leading to significant business impact Work closely … visualization tools like Tableau, Quicksight or similar tools - Experience with R, Python or other statistical/machine learning tools - Experience demonstrating problem solving and root cause analysis - Experience using databases with a large-scale data set - Bachelor's degree in engineering, analytics, mathematics, statistics or a related
Employment Type: Permanent
Salary: GBP Annual

L3 Data Support Engineer

London, United Kingdom
Tcr International
system failures, and performance issues and leverage advanced scripting and orchestration tools (e.g., Python, Bash, Apache Airflow) to automate workflows and reduce operational overhead. Root Cause Analysis & Incident Management: Lead post-incident reviews, perform root cause analysis for data disruptions, and implement corrective actions
Employment Type: Permanent
Salary: GBP Annual

Application Support Manager

London, United Kingdom
Just Group plc
Own Application Support Lifecycle: Ensure end-to-end support for critical business applications, meeting SLAs and availability targets. Incident & Problem Management: Lead resolution and root cause analysis for all Retail application incidents, including major (P1/P2) issues. Escalation & Crisis Leadership: Act as the escalation point for … and AKS. Familiarity with modern web technologies, including React, REST APIs, and SOAP architectures. Skilled in managing P1/P2 incidents, business impact analysis, root cause investigations, and change coordination. Strong grasp of IT service management practices; ITIL v4 certification or equivalent preferred. Proactive Monitoring: Hands
Employment Type: Permanent
Salary: GBP Annual

Senior Data Analyst - Pricing Data Engineering & Automation, CUO Global Pricing

London, United Kingdom
Hybrid / WFH Options
Allianz Popular SL
within the Pricing Function and across other Allianz Commercial functions. Some of your specific responsibilities could include: Drive best practices for data quality and root cause analysis across the team, and coach and mentor junior team members. Act as a bridge between data teams and other stakeholders … of results-driven collaboration, support and respect. What You'll Bring to the Role Approx. 8 years' experience using SQL or Python for data analysis, with about 3 years' experience in P&C insurance. A degree at BSc or MSc level in a Numerical field, preferably with a strong … in analysing, debugging and solving highly complex problems. Experience in coaching and mentoring team members of varying functions and levels in data quality and root cause analysis techniques. Experience using Power BI, Tableau, or similar tools for data visualisation and anomaly detection. Knowledge of P&C insurance
Employment Type: Permanent
Salary: GBP Annual

Senior SecOps Analyst

London, United Kingdom
Hybrid / WFH Options
IG Index Limited
events within IG. The team's goals are to ensure that security incidents adversely affecting the business are quickly diagnosed, workarounds are determined, proper root cause analysis is performed, and actions are taken to prevent the issue from reoccurring. The Security Operations function is vital to the … and accurate logs are made of all actions during incident response. Support and mentor colleagues with best-practice incident management techniques and behaviours. Perform root cause analysis, recommend process improvements, and write final post-incident reports. Project Delivery Take part in the team's project delivery initiative
Employment Type: Permanent
Salary: GBP Annual

Complaints Manager

London, United Kingdom
Hybrid / WFH Options
Abound
leadership role, supporting and guiding our operational team, helping to build capability, and developing enhanced reporting frameworks that provide deep insights into trends and root causes. You'll be central to evolving our data-driven approach, using reporting to inform proactive improvements across our customer journey and operational performance. … financial regulatory frameworks. Problem-Solver: Proven ability to dissect complex cases, conduct thorough investigations, and apply sound judgement to drive fair customer outcomes. Reporting & Analysis: Experience in building, enhancing, and interpreting operational and complaints MI, spotting trends and making data-led recommendations. Communicator: Outstanding written and verbal communication skills … manage sensitive and high-pressure conversations both internally and externally. Analytical: Strong analytical mindset with excellent attention to detail, capable of identifying patterns, conducting root cause analysis, and implementing improvements. What you'll be doing Team Leadership & Support: Act as a senior figure within the operational team
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, United Kingdom
Mistral AI
alerting systems) for both our client-facing APIs and large training runs. Participate occasionally in on-call rotations to respond to incidents and perform root cause analysis to prevent future occurrences. Development (50%) Drive continuous improvement in infrastructure automation, deployment, and orchestration using tools like Kubernetes, Flux … a DevOps/SRE role. Strong experience with cloud computing and highly available distributed systems. Exposure to site reliability issues in critical environments (issue root cause analysis, in-production troubleshooting, on-call rotations). Experience working against reliability KPIs (observability, alerting, SLAs). Hands-on experience with
Employment Type: Permanent
Salary: GBP Annual

Data Centre Site Lead NSC

London, United Kingdom
Ll Oefentherapie
Data Centers within your designated region. Conduct audits for power and mechanical capacity and oversee upgrades. Collaborate with internal teams to troubleshoot and perform Root Cause Analysis (RCA) and Corrective Action (CA) for design-related issues. Liaise with local colocation partners to comprehend and synchronize site utility … center equipment. Collaborate with project teams and colocation partners to validate the functionality of electrical and mechanical systems. Extend operational support encompassing failure mode analysis, root cause identification, maintenance assistance, best practices, procedural reviews, and more. Curate and maintain comprehensive technical documentation concerning corporate data centers and
Employment Type: Permanent
Salary: GBP Annual

Data Platform Support Lead

London
Hybrid / WFH Options
RSA
pipelines, Databricks workflows, and SQL databases to ensure seamless data processing. You will troubleshoot and resolve production incidents in Azure-based data pipelines, conducting root cause analysis and implementing preventive measures. You will oversee and optimize the performance of Databricks notebooks and clusters to support efficient data … optimizing queries for performance. You will have experience with Azure Monitor, Log Analytics, and issue detection. You will have good problem-solving skills for root cause analysis and implementing preventive solutions. You will have knowledge of compliance standards, role-based access control (RBAC), and secure Azure networking.
Employment Type: Permanent

Site Reliability Engineer (London)

London, UK
Selby Jennings
of critical data systems that drive the hedge fund's trading strategies. You will collaborate with cross-functional teams to optimise system performance, perform root cause analysis, and drive remediation efforts to resolve production incidents. Key Responsibilities: Develop and implement automated solutions to streamline the management of … systems. Troubleshoot and resolve production issues related to Enterprise and Reference Data applications, ensuring minimal downtime and maximum system availability. Conduct post-issue evaluations, root-cause analysis, and remediation efforts to prevent future incidents. Analyse system performance, identify failure patterns, and create performance tests to improve the

Site Reliability Engineer

London, England, United Kingdom
Selby Jennings
of critical data systems that drive the hedge fund's trading strategies. You will collaborate with cross-functional teams to optimise system performance, perform root cause analysis, and drive remediation efforts to resolve production incidents. Key Responsibilities: Develop and implement automated solutions to streamline the management of … systems. Troubleshoot and resolve production issues related to Enterprise and Reference Data applications, ensuring minimal downtime and maximum system availability. Conduct post-issue evaluations, root-cause analysis, and remediation efforts to prevent future incidents. Analyse system performance, identify failure patterns, and create performance tests to improve the

Site Reliability Engineer

London, South East England, United Kingdom
Selby Jennings
of critical data systems that drive the hedge fund's trading strategies. You will collaborate with cross-functional teams to optimise system performance, perform root cause analysis, and drive remediation efforts to resolve production incidents. Key Responsibilities: Develop and implement automated solutions to streamline the management of … systems. Troubleshoot and resolve production issues related to Enterprise and Reference Data applications, ensuring minimal downtime and maximum system availability. Conduct post-issue evaluations, root-cause analysis, and remediation efforts to prevent future incidents. Analyse system performance, identify failure patterns, and create performance tests to improve the

Electrical Engineering Manager - SME Leader - Data Centres

Greater London, England, United Kingdom
PRS
time management and quality delivery from different stakeholders Client-oriented strong stakeholder management, providing frequent updates (externally, internally) to ensure continuous clarity Conduct thorough root-cause analysis and implement lessons learned to drive continuous improvement Involvement in training development for electrical systems, ensuring adherence to company standards … desirable Proven track record of electrical optimization and efficiency improvements, preferably in Data Centres Excellent stakeholder management and communication skills Ability to conduct root-cause analysis and implement lessons learned Experience in developing and delivering training programs Familiarity with SOP/EOP development and the CAB

Support Lead

London, United Kingdom
InvestorFlow, Inc
complex or high-priority support issues within the UK. Troubleshoot and resolve client issues across InvestorFlow's Pulse, Portfolio, and Pipe product lines. Conduct root cause analysis and escalate bugs, enhancements, and outages to the appropriate internal teams. Monitor and drive regional support KPIs (e.g., SLA adherence … . Proven experience in a client-facing technical support role with demonstrated ability to mentor peers or junior team members. Strong diagnostic, debugging, and root cause analysis skills. Excellent written and verbal communication skills in English. Familiarity with knowledge management and internal training best practices. Ability to
Employment Type: Permanent
Salary: GBP Annual

Solutions Engineer

West London, South East England, United Kingdom
DP World
to business made by individuals and function as a whole Act as a “trouble shooter” when required, identify pinch points or problem areas, ensure root cause analysis is completed where appropriate and corrective measures are implemented into operations Able to work independently and under own initiative on … engineering management and local plant management Support in the training and developing of plant-based engineering teams and embed regional standards within sites. Complete root cause analysis where appropriate and identify corrective measures Be a SME (subject matter expert) on engineering processes, tools and techniques including MTM

Solutions Engineer

South West London, South East England, United Kingdom
DP World
to business made by individuals and function as a whole Act as a “trouble shooter” when required, identify pinch points or problem areas, ensure root cause analysis is completed where appropriate and corrective measures are implemented into operations Able to work independently and under own initiative on … engineering management and local plant management Support in the training and developing of plant-based engineering teams and embed regional standards within sites. Complete root cause analysis where appropriate and identify corrective measures Be a SME (subject matter expert) on engineering processes, tools and techniques including MTM

System Dev Engineer - II, TSE Ops Tech

London, United Kingdom
Amazon
working in collaboration with an engineering team to provide operational support for multiple products and platforms, including engineering development support (continuous deployment, operational readiness, root-cause analysis, code fixes, unit and integration test coverage, metrics and dashboards), customer support self-service (tools development), and business decision-making … Keep software up-to-date based on software deprecation campaigns, modernization initiatives, and help identify and improve tech-debt scenarios. • Perform comprehensive troubleshooting and root cause analysis for technical challenges. Software Development and Maintenance • Develop and implement operational tools and automation solutions using Ruby, Rails, Java, Python
Employment Type: Permanent
Salary: GBP Annual

Schneider BMS Engineer (Control Systems / Schneider)

Sidcup, England, United Kingdom
Ernest Gordon Recruitment
in a design and installation focussed BMS role. This role offers the chance for extensive career progression within the company. You will be conducting root cause analysis to identify issues in existing designs, as well as designing your own complex systems. You will be integrating systems on … and develop complex systems, meeting specific requirements and performance criteria. Utilise SolidWorks and CAD to create detailed schematic models of components and systems. Conduct root cause analysis to identify issues within existing designs and processes, providing solutions. Collaborate with cross-functional teams to integrate System on Chip

Project Controls Manager / Power BI

London, United Kingdom
Hybrid / WFH Options
Cooper Moss Rutland LLP
develop and manage insightful dashboards and reports, monitor key performance indicators, and present actionable insights to stakeholders. Your role will also involve predictive analytics, root-cause analysis, risk management collaboration, and ensuring compliance with industry standards. Proficiency in tools like Power BI, SQL, and Python, along with … performance indicators and identify trends, deviations, and improvement opportunities. Present clear and actionable insights to stakeholders to enable effective project control decisions. Data Modelling & Analysis: Develop predictive analytics models to assess potential project outcomes based on current data and trends. Conduct root-cause analyses of project variances … with project controls disciplines such as cost control, scheduling, and risk management. Understanding of Earned Value Management (EVM) and techniques like cost forecasting, variance analysis, and benchmarking. About You Essential Bachelor's degree in data science, Engineering, Project Controls, Finance, or a related discipline. Professional certifications (e.g., AACE, PMI
Employment Type: Permanent
Salary: GBP Annual

Amazon Technologies Systems Engineer, Reliability and Automation Engineering Team (RAE)

London, United Kingdom
Amazon
system optimizations, reliability metrics, and overall equipment effectiveness. Utilizing this data, you will engage customers to understand and document business requirements, drive problems to root cause, and manage implementation programs for corrective actions. You will apply your expertise in robotics, mechatronics, reliability engineering and system lifecycle management to … ensure optimal performance and availability of thousands of workcells across Amazon's global fulfillment center network. Key job responsibilities Primary responsibilities: - Utilize data analytics, root cause analysis, design of experiments and six sigma methods to improve reliability and availability of critical automation equipment and robotics & mechatronics systems. … deliver structured problem solving. - Design, pilot and implement new preventative maintenance standards, interval adjustments, part replacement cycles, and work instruction streamlining based on granular analysis of reliability (MTBF, MTTR, OEE, etc.) trends across assets. - Develop automated reporting through advanced SQL and Quicksight to provide RME leadership with actionable insights
Employment Type: Permanent
Salary: GBP Annual

Lead Site Reliability Engineer (Surrey)

London, UK
Blackfield Associates
implement scalable, efficient systems for maximum reliability. Lead incident response and implement monitoring solutions to maintain high system uptime. Optimize performance through in-depth analysis and continuous improvement. Develop preventive maintenance programs and carry out Root Cause Analysis (RCA) to eliminate recurring issues. Collaborate with cross … reduce manual tasks. Ensure compliance with regulatory standards and uphold best security practices. Contribute to quality systems through deviation management, CAPA follow-up, and root cause investigations. What We’re Looking For: 5+ years of experience in Site Reliability Engineering or a related field. Hands-on experience with

Root Cause Analysis, London
10th Percentile: £45,000
25th Percentile: £50,000
Median: £65,000
75th Percentile: £83,750
90th Percentile: £117,500