SLOs, and error budgets for critical systems. - Monitor system performance, diagnose issues, and implement long-term fixes. Incident Response & Prevention - Coordinate high-impact incident response efforts and postmortems. - Drive root cause analysis and long-term improvements. Tooling & Automation - Build and enhance internal tooling to improve deployment, monitoring, and reliability. - Implement infrastructure as code and CI/CD …
checks to identify process defects Reporting Support the creation of routine reporting packs and dashboards for internal stakeholders, utilising and defining performance metrics - Service Level Agreements (SLAs) etc Conduct Analysis utilising tools such as Excel or PowerBI, to identify trends and opportunities for both system optimisation and improvement in operational performance Continuous Improvement - Operations process optimisation Proactively identify opportunities … generating and maintaining a knowledgeable Problem Solving Critically assess and collaboratively work alongside the function's operations team, managed service vendors and enterprise IT team to identify/support root cause analysis and remediation of issues, incidents and escalation. Bridge the gap by translating business requirements to the Tech team and vice versa Vendor Management Maintain a …
Accrington, England, United Kingdom Hybrid / WFH Options
World Options Ltd
governance across the UK operations and ensuring that every technology investment delivers tangible, measurable benefits that positively impact revenue, margin, and EBITDA. Key Responsibilities Requirements Management: Lead the collection, analysis, and prioritisation of functional and non-functional requirements across the three UK business units. Translate approved requirements into clear user stories, detailed acceptance criteria, and well-defined delivery plans … IT Manager. Establish and monitor effective Service Level Agreements (SLAs) and Operational Level Agreements (OLAs), curate a comprehensive knowledge base, measure user satisfaction (CSAT, NPS), and drive thorough incident root-cause analysis. Stakeholder Engagement & Communication: Act as a trusted advisor and key liaison for UK franchise partners, country management, and functional leads. Produce clear, data-driven status reports … UK IT Manager & Help Desk Team Development partners (internal & external) supporting UK systems UK Franchise partners & store owners Skills & Experience Proven track record of 7+ years in IT business analysis, product ownership, or IT governance roles, ideally within multi-site or franchise organisations operating in the UK. Demonstrable success in managing technology initiatives within complex, multi-platform environments (experience …
storage, backups, and Linux systems using tools such as Ansible, Terraform, and GitHub. Collaborate with cross-functional teams to align infrastructure delivery with DevOps best practices. Lead incident response, root cause analysis, and ongoing support for critical infrastructure services. Define and implement infrastructure administration standards and procedures. Champion Infrastructure as Code and continuous improvement across the hosting …
London, South East, England, United Kingdom Hybrid / WFH Options
Tate Professional
storage, backups, and Linux systems using tools such as Ansible, Terraform, and GitHub. Collaborate with cross-functional teams to align infrastructure delivery with DevOps best practices. Lead incident response, root cause analysis, and ongoing support for critical infrastructure services. Define and implement infrastructure administration standards and procedures. Champion Infrastructure as Code and continuous improvement across the hosting …
Be proficient in Linux server and system administration (e.g., package management, kernel updates, filesystems, volume management) Have experience managing containerized workloads using Docker or Kubernetes Be an expert in Root Cause Analysis Have a strong desire to learn new skills and technologies, with proven research capabilities and adaptability Possess at least two years of experience training and …
links and/or collaborating with agency stakeholders. Assist in system design, development, and implementation. Install, configure and maintain hardware and software. Analyze and resolve issues, and determine/provide root cause analysis with details on resolution/restoration. Provide cybersecurity support and documentation for information/operational technology and/or telecommunications systems to obtain favorable assessments …
operational performance, and security compliance. Facilitate effective communication between IT teams and business units. Problem Solving and Incident Management: Manage and resolve high-priority incidents and critical issues. Conduct root cause analysis and implement corrective actions to prevent recurrence. Develop and maintain incident response plans and procedures. Requirements: Proven experience as a Digital Operations Manager, IT Manager …
Waterwells Business Park, Quedgeley, Gloucester, Gloucestershire, England, United Kingdom Hybrid / WFH Options
IMT Resourcing Solutions
and infrastructure (RMS, mobile and CAD platforms). Key Responsibilities Validate, cleanse and enrich large, operational datasets; fix anomalies before they hit production. Profile data, uncover patterns and perform root-cause analysis using T-SQL and BI/visualisation tools. Own data-quality KPIs (completeness, accuracy, timeliness) and present clear insights to stakeholders. Maintain data dictionaries, quality …
Salisbury, Wiltshire, United Kingdom Hybrid / WFH Options
Sopra Steria Group
not limited to Cisco Routing, Switching, Security, SDN, Unified Communications and Wireless technologies). Identify and explore opportunities for enhancing efficiency, leveraging orchestration technologies to streamline and automate. Lead 'Root Cause Analysis' investigations into network faults, security and performance issues. Support the Principal NetOps Engineer and Architects with project implementation. Liaise with third party service providers for …
Portsmouth, Hampshire, South East, United Kingdom Hybrid / WFH Options
Sopra Steria Limited
not limited to Cisco Routing, Switching, Security, SDN, Unified Communications and Wireless technologies). Identify and explore opportunities for enhancing efficiency, leveraging orchestration technologies to streamline and automate. Lead 'Root Cause Analysis' investigations into network faults, security and performance issues. Support the Principal NetOps Engineer and Architects with project implementation. Liaise with third party service providers for …
leaders both within Leidos and CBP. Assist in change management of container platform for new versions of OpenShift, Hotfixes, SysAdmin tasks, etc. Works with experienced team members to conduct root cause analysis of issues, review new and existing code and/or perform unit testing. Authors/edits system documentation/playbook(s) Partners with experienced team …
Document as you go - to help colleagues follow what's been done and why Drive tasks forward with energy and enthusiasm Create proactive monitoring solutions using standard tooling Conduct RCA (Root Cause Analysis) for incidents Develop and maintain self-managing infrastructure services and dashboards Define the metrics of success and report on progress Implement Infrastructure as Code for … automation tools (Puppet, Ansible, Git), pipelines (Azure DevOps) and test automation Experience with CI/CD tooling (Azure DevOps) Comfortable with Elasticsearch log standardisation, Kibana dashboard creation and data analysis skills AWS hands on - CloudFormation, Route53, S3, DynamoDB, CloudWatch, Lambda, Security, and troubleshooting, Azure experience also useful Certificate management and automation Strong troubleshooting and diagnosis skills Able …
and non-technical staff across numerous areas. * Proven ability to work independently on multiple tasks with commitment and willingness to see issues through to resolution * Excellent problem solving and Root Cause Analysis skills * Proficiency in understanding, analysing and defining corrective actions for any tickets raised by users * Understanding of virtualization and environments ability to understand Intune administration * Knowledge …
features. Authors and maintains comprehensive technical documentation including detailed system configurations, governance models, and operational procedures. Acts as a senior escalation point for Level 3/4 support, performing root cause analysis and driving long-term resolution of complex issues. Manages the technical scope, delivery timelines, and risk mitigation strategies for cloud engineering initiatives. Tracks and reports …
and in-depth experience of Oracle Engineered systems and subsystems, especially Exadata Ability to troubleshoot and resolve complex hardware/software issues, restore environments to an operational state, perform root cause analysis and provide forward thinking mitigation strategies Good communication and analytical skills Familiarity with security practices in web application delivery and general knowledge of network topology …
data. Perform regular vulnerability assessments, patch management, and security audits to safeguard infrastructure and prevent unauthorized access. Monitor systems for security incidents, respond to threats, and conduct investigations and root cause analysis to mitigate future risks. Manage relationships with vendors and external Managed Service Providers (MSPs) to ensure timely and effective support. Develop and maintain comprehensive documentation …
Cambridge, Cambridgeshire, East Anglia, United Kingdom
In Technology Group Limited
e.g., SolarWinds, PRTG, Zabbix, or similar). Knowledge of operating systems (Windows Server, Linux) and virtualisation platforms (VMware, Hyper-V). Strong troubleshooting skills and the ability to perform root cause analysis. Excellent written and verbal communication skills. Willingness to work in a shift-based environment, including evenings, nights, and weekends if required. Desirable: Exposure to cloud platforms …
Monitoring & Observability Build and manage comprehensive monitoring and logging systems for network performance, latency, and availability. Implement observability frameworks using modern tools to provide real-time insight and support root-cause analysis. Collaboration & Project Leadership Act as a key stakeholder in cross-functional teams, working with Infrastructure, Security, DevOps, and Application teams to deliver secure and high-performance …
City of London, Greater London, UK Hybrid / WFH Options
Laser Digital
Monitoring & Observability Build and manage comprehensive monitoring and logging systems for network performance, latency, and availability. Implement observability frameworks using modern tools to provide real-time insight and support root-cause analysis. Collaboration & Project Leadership Act as a key stakeholder in cross-functional teams, working with Infrastructure, Security, DevOps, and Application teams to deliver secure and high-performance …
and peripheral equipment for executives. Mobile device support and advanced troubleshooting skills (Apple & Android technologies). Proactively identify potential technical issues and implement preventive solutions and advanced troubleshooting and root cause analysis. Liaising with and delegating tasks to relevant teams for escalation. Supporting the Exec Support Specialist and escalating support issues to the Head of IT where necessary. …
compliance with organisational level service agreements. Manages the operational Systems Environments affecting the whole health board. Contributes towards ICT Strategy. Works as part of problem management team to determine root cause analysis. Work with external suppliers to escalate Major Digital Incidents. Works with external Suppliers and contributes towards contracts negotiation and requirements. Person Specification Qualifications Essential Educated to …