complex incidents using tools like SCCM, MS Endpoint, Intune, PowerShell, and Active Directory (on-prem and Azure AD). Lead incident and problem management processes, ensuring timely resolution and root cause analysis reporting. Maintain accurate records in ITSM platforms (e.g., ServiceNow, Remedy, HEAT). Support ITIL-aligned service delivery and act as a core member of the …
new and updated system changes. Developing, executing, and improving documentation for installation, configuration, hardening, and operations and maintenance tasks. Ensuring compliance with IT infrastructure standards, policies, and procedures. Conducting root cause analysis and resolving system and application faults and errors. Ensuring operating systems and applications comply with Department of Defense (DoD) guidelines, including DISA Security Technical Implementation …
Birmingham, West Midlands, West Midlands (County), United Kingdom
Hunter Selection
implement scalable, resilient, and secure infrastructure solutions aligned to organisational strategy. Lead BAU operations across networks, firewalls, hosting platforms, and server endpoints. Proactively monitor systems, troubleshoot issues, and conduct root cause analysis. Own disaster recovery and business continuity planning, testing, and documentation. Act as a subject matter expert on infrastructure and cybersecurity best practice. Mentor junior engineers …
Employment Type: Permanent
Salary: £46,000 - £55,000/annum, 33 days holiday, bonus + more
operations and maintenance tasks. Document activities, status, and issues worked on. Provide input to and follow Configuration Management processes. Ensure adherence to IT infrastructure standards, policies, and procedures. Perform root cause analysis and resolve system and application faults and errors. Maintain working knowledge of Microsoft Active Directory, Group Policy Objects (GPOs), DHCP, DNS, and PowerShell. General understanding …
of infrastructure components. 2. Monitoring and Incident Management: - Develop and maintain monitoring solutions to proactively identify performance bottlenecks, system outages, and other potential issues. - Participate in incident response and root cause analysis efforts to drive continuous improvement and prevent future incidents. 3. Reliability and Performance Optimization: - Optimise system performance, reliability, and cost efficiency through continuous monitoring, performance …
Shrivenham, Oxfordshire, United Kingdom Hybrid / WFH Options
Gold Group
Collaborate with engineering teams to support unified access devices (UADs), endpoint management, and virtualized environments. * Provide hands-on support for automation scripts, workflows, and infrastructure monitoring tools. * Contribute to root cause analysis efforts for recurring platform incidents. * Support capacity planning and performance optimization by analysing system usage and trends. * Offer feedback on tools and processes, identifying improvements …
and test network engineering/administration activities. • Create and maintain Standard Operating Procedures (SOPs) and technical documentation. • Provide follow-up reports (technical findings, feedback, and resolution steps taken) for Root Cause Analysis and process improvement initiatives. Required Qualifications: • Minimum of a Bachelor's degree in Science, with 12-15 years' experience or Master's degree with …
and test network engineering/administration activities. Create and maintain Standard Operating Procedures (SOPs) and technical documentation. Provide follow-up reports (technical findings, feedback, and resolution steps taken) for Root Cause Analysis and process improvement initiatives. Required Qualifications: Top Secret Clearance. Minimum of a Bachelor's degree in Science, with 12-15 years' experience or Master's …
activities with the NMCI Operations Manager, NOC Lead, Release Management team and other key stakeholders. • Tier III escalation support and vendor engagement supporting Incident Management activities. • Active participation in Root Cause Analysis for Problem Management activities. You'll Bring These Qualifications: • Requires B.S. Degree and 8-12 years of prior relevant experience or Master's with …
analyze overall health of Splunk infrastructure to include daily indexing volume, search volume and performance, data source reporting, user activity reporting, and custom apps/dashboards/visualizations. Perform root cause analysis on any issues with recommendations. Implement tactical and strategic solutions to problems. Develop, manage, and maintain documents supporting Splunk architecture and operational processes. Data onboarding …
related issues affecting managed devices. Collaborate closely with cross-functional teams, including infrastructure, security, and application teams, to ensure seamless integration and support of managed devices. Conduct in-depth root cause analysis and identify trends to prevent recurring issues and minimize service disruptions. Performs implementation and maintenance of authorized software changes, related to assigned applications and the …
and testing efforts to maintain software quality and performance. Support CI/CD pipelines using Jenkins and contribute to automated testing and deployment. Troubleshoot and resolve production issues, performing root cause analysis and providing timely solutions. Mentor junior engineers and share knowledge across the team to foster a collaborative working environment. Basic Qualifications: Bachelor's degree in …
Systems Support team (CIM), Operational Technology Engineers, Data Engineers, and Web Developer. Monitoring and reporting on system performance, availability, and incident response metrics. Providing leadership in incident management and root cause analysis for system-related issues, while also ensuring effective change control procedures for all changes introduced to the factory (ITIL). Managing and leading a team of …
Stockport, Greater Manchester, North West, United Kingdom
Nexperia
Systems Support team (CIM), Operational Technology Engineers, Data Engineers, and Web Developer. Monitoring and reporting on system performance, availability, and incident response metrics. Providing leadership in incident management and root cause analysis for system-related issues, while also ensuring effective change control procedures for all changes introduced to the factory (ITIL). Managing and leading a team of …
and legacy systems/technical debt activities. Collaborate with Senior Engineers to improve delivery automation and enhance DevEx and self-servicing. Aligns to effective incident response processes, helping with root cause analysis and problem resolution during incident management sessions. Take ownership and pride in the work you deliver, ensure what is delivered is of quality and takes …
our tools and platforms. Collaborate with the team to troubleshoot and resolve issues, shadowing and learning from Mid and Senior-level Engineers. Aligns to incident response processes, helping with root cause analysis and problem resolution during incident management sessions. Take ownership and pride in the work delivered, ensure what is delivered is of quality and takes into …
Systems Support team (CIM), Operational Technology Engineers, Data Engineers, and Web Developer. Monitoring and reporting on system performance, availability, and incident response metrics. Providing leadership in incident management and root cause analysis for system-related issues, while also ensuring effective change control procedures for all changes introduced to the factory (ITIL). Managing and leading a team of …
assigned trouble tickets, including incident and deployment tickets, providing timely updates in the ticketing system that are informative of the work performed. Also correct identified discrepancies and provide root cause analysis to prevent future discrepancies. Distribute peripheral IT equipment supplies (e.g., network printer toner). Support the maintenance of local area network (LAN) and wide area …
and enhancing Cloud One Infrastructure as Code (IaC). • Responsible for daily system monitoring, security, SW Factory Cloud health, resources, and log management of AWS systems. • Troubleshooting and performing root cause analysis. This includes troubleshooting all issues with Cloud system configurations, backups, file systems, and user access. • Interface with engineers, systems engineers, and subject matter experts. • Troubleshoot networking …
operational procedures. Mentor team members and contribute to a culture of learning and inclusion. Continuously improving infrastructure reliability and reducing manual work (TOIL). Participating in incident response and root cause analysis. Why Join Us? Join our team and contribute to a culture of innovation, collaboration, and excellence. If you are ready to advance your career and make …
that cannot be addressed by First or Second Line support. You will play a key role in maintaining and improving the organisation’s IT infrastructure, performing deep-dive diagnostics, root cause analysis, and implementing long-term solutions. In addition to supporting escalated incidents, you will contribute to system design, strategic projects, and continuous service improvement. Key Responsibilities … Expert-Level Support & Issue Resolution: Take ownership of high-level, complex incidents and problems escalated from Second Line Support. Perform in-depth diagnostics and root cause analysis across infrastructure, systems, and applications. Develop and implement long-term fixes and preventative measures to reduce repeat incidents. Infrastructure Management & Improvement: Maintain, monitor, and optimise servers, storage, networking, and virtual … support role. Strong expertise in server administration, networking, virtualisation, and storage solutions. Solid understanding of IT security principles and best practices. Ability to carry out detailed troubleshooting and perform root cause analysis. Experience managing or contributing to technical projects and service improvements. Proficiency in tools such as Active Directory, Group Policy, Office 365, Exchange, and Windows Server …