Sheffield, Yorkshire, United Kingdom Hybrid / WFH Options
Experis - ManpowerGroup
and GCP, ensuring resilience, cost-efficiency, and data security. Collaborate closely with infrastructure, architecture, and cybersecurity teams to meet internal risk, compliance, and governance requirements. Support live systems, perform root cause analysis, and implement solutions for incidents and performance bottlenecks. Qualifications and experience The ideal candidate for this role will have the below experience and qualifications: Bachelor …
and GCP, ensuring resilience, cost-efficiency, and data security. Collaborate closely with infrastructure, architecture, and cybersecurity teams to meet internal risk, compliance, and governance requirements. Support live systems, perform root cause analysis, and implement solutions for incidents and performance bottlenecks. Required Skills/Experience The ideal candidate will have the following: Bachelor's or Master's degree …
and test network engineering/administration activities. Create and maintain Standard Operating Procedures (SOPs) and technical documentation. Provide follow-up reports (technical findings, feedback, and resolution steps taken) for Root Cause Analysis and process improvement initiatives. Required Qualifications Top Secret Clearance Minimum of a Bachelor's degree in Science, Technology, Engineering and Math (preferred) with …
Reading, Berkshire, United Kingdom Hybrid / WFH Options
Pertemps
you'll be doing as a Senior Cyber Security Analyst: Security Incident Response: Investigate security alerts from SIEM and third-party MSSPs, triage and respond to incidents, and support root cause analysis to drive remediation. Stakeholder Engagement: Work closely with technology and business teams to communicate cyber risks, recommend actions, and ensure controls are proportionate and effective. …
all related network and endpoint security components. Collaborate with the Information Security Specialist to validate ZTA effectiveness through testing, exercises, and real-time monitoring. Lead post-incident forensics and root cause analysis to ensure rapid containment, mitigation, and capture lessons learned to reduce future system compromise. Guide configuration baselines and system hardening strategies aligned with RMF controls … and mission risk profiles. Conduct rigorous system testing, security drills, and continuous monitoring to validate enforcement and effectiveness of controls and provide in-depth post-incident analysis in response to any breaches or anomalies. Identify risk management practices, create incident response procedures/planning, and champion a cybersecurity-aware culture through staff training, technical mentorship, and stakeholder engagement. Develop …
IT Service Management (ITSM) processes across all teams, ensuring standardized, efficient, and effective service delivery. Establish SRE-based operational metrics, including SLOs, SLIs, and error budgets. Oversee incident response, problem resolution, and root cause analysis with AI-driven remediation. Ensure high availability, performance, and security compliance for all enterprise services. Develop a follow-the-sun operational support model, ensuring 24x7 resilience and uptime across …
Falls Church, Virginia, United States Hybrid / WFH Options
Epsilon Inc
of data between systems by helping with Extract, Transform, Load (ETL) processes and ensuring data consistency across different platforms. Monitor and Troubleshoot Database Performance Issues - Identify potential bottlenecks, perform root cause analysis, and work with senior architects to implement solutions that enhance database reliability and efficiency. Support Compliance and Regulatory Requirements - Ensure database structures and data management …
expectations in partnership with a member of the Project Management team or acting as project Lead. Your Responsibilities: Support incident management for the support team, ensuring robust troubleshooting and root cause analysis. Ability to support and resolve incidents effectively for the support team. This role is a joint Application Management Service (support) team and project implementation role. Collaborate with functional …
reliability engineer develops and implements solutions to prevent them, ultimately enhancing the reliability of systems, equipment, and processes. Responsibilities: Analyzing equipment failure data to detect patterns and trends. Conducting root cause analysis to identify the underlying causes of issues. Creating and implementing new maintenance procedures. Designing and establishing new protocols for monitoring and testing equipment. Exploring new … is incorporated into all areas of the organization. System Reliability: Design and implement strategies to improve the availability, reliability, and performance of critical systems and applications. Incident Management: Lead root cause analysis for major incidents, identify systemic issues, and implement long-term solutions to prevent recurrences. Monitoring and Alerting: Develop and maintain robust monitoring systems to detect … issues proactively and optimize alerting mechanisms to ensure timely response. Capacity Planning: Analyze system usage patterns to predict future growth, optimize capacity, and ensure scalability. Failure Analysis: Conduct thorough failure analysis and implement fault-tolerant systems to minimize the impact of potential failures. Collaboration: Work closely with software engineering, DevOps, and infrastructure teams to design reliable architecture and …
is for Team B Day Shift, the hours are 7 AM-7 PM Thursday - Saturday and every other Sunday. Responsibilities: Monitor network traffic for security events and perform triage analysis to identify security incidents. Respond to computer security incidents by collecting, analyzing, and preserving digital evidence, and ensure that incidents are recorded and tracked in accordance with SOC requirements. … Document actions taken and create technical reports detailing investigation efforts and case outcome to SOC Management and the client. Utilize technologies to conduct host forensics, Endpoint Detection & Response, log analysis, and network forensics (full packet capture solution). Provide cybersecurity root-cause analysis and investigative alerts to examine endpoint activity and network-based data. Conduct malware … analysis, host and network forensics, log analysis, and triage in support of incident response. Recognize attacker and APT activity, tactics, and procedures as indicators of compromise (IOCs) that can be used to improve monitoring, analysis, and incident response. Develop and build security content, scripts, tools, or methods to enhance the incident investigation processes. Isolate and remove malware. …
hands-on role supporting high-availability systems, rapid deployments, and production incident response. Key Responsibilities - Manage and monitor AWS infrastructure for performance and security - Respond to production incidents, perform root cause analysis, and implement fixes - Maintain observability tools (Prometheus, Grafana, Splunk) and write PromQL queries - Improve and operate CI/CD pipelines using GitHub Actions and Kubernetes … Prometheus, Grafana, Splunk, and PromQL - Proficient in scripting (Python, Go, Bash, SQL) - Skilled in GitHub, CI/CD, and Kubernetes operations Desirable: - Experience with Terraform or CloudFormation - Advanced log analysis with Splunk - Strong problem-solving and analytical thinking …
to-end tests on code commits and pull-requests. • Monitor pipeline health and test results; collaborate with DevOps to optimize build times, parallelize tests, and reduce pipeline flakiness. Result Analysis & Root Cause • Analyze test outputs, system logs, and metrics (e.g., via ELK Stack or Prometheus/Grafana) to pinpoint failures and performance regressions. • Lead root-cause … testing activity efficiently. An ISTQB Foundation Certification is a strong asset and shows your commitment to professional testing standards. A key part of this role involves problem investigation and root cause analysis, so strong analytical and communication skills are a must. You'll enjoy working as part of a collaborative team, contributing your insights to improve outcomes …
cloud environments, including compute and storage scalability Containerisation & Virtualisation: Familiarity with virtual and physical server provisioning, especially in strategic data centres Platform Resilience & Observability: Designing for uptime, performance, and root cause analysis. Web Services & APIs: Used for Integration with 24+ LBGI systems Batch Processing: Understanding of batch suite performance and scheduling constraints RPA & Automation (Batching): Familiarity with robotic … process automation Log Aggregation & Analysis: Tooling for log interrogation and root cause analysis (e.g., Splunk, Dynatrace). Dashboarding: Real-time analytics dashboards for infrastructure and application health Support & Troubleshooting: Remote operations, incident response, and environment health checks. About working for us Our ambition is to be the leading UK business for diversity, equity and inclusion supporting …
implement scalable, resilient, and secure infrastructure solutions aligned to organisational strategy Lead BAU operations across networks, firewalls, hosting platforms, and server endpoints Proactively monitor systems, troubleshoot issues, and conduct root cause analysis Own disaster recovery and business continuity planning, testing, and documentation Act as a subject matter expert on infrastructure and cybersecurity best practice Mentor junior engineers … Certifications such as ITIL, CCNA, Microsoft, VMware, or Citrix preferred Familiarity with automation tools (Ansible, Terraform) is a bonus Leadership and mentoring capabilities Data-driven decision-making and performance analysis Vendor and stakeholder management Strong problem-solving and risk mitigation skills Customer-focused with an eye for service delivery improvements Excellent communication and strategic thinking abilities If you are …
Dynamics 365 (D365) Finance and Operations (F&O), Business Central, or comparable ERP systems. (Certification in Dynamics 365 or a related ERP system is desirable.) Experience with data analysis, process mapping, root cause analysis and problem-solving in an ERP environment. Excellent communication and collaboration skills with internal and external stakeholders, with the ability to …
disciplinary teams, ensuring alignment with product and business goals. Provide mentorship and technical guidance to less experienced engineers. Promote collaboration across international and distributed teams. Engage in system architecture, root cause analysis, and continuous integration processes. What We're Looking For: Degree in Computer Science, Software Engineering, or a related field. Professional level expertise in C++ development … Fitnesse, Cucumber), and hardware debuggers (e.g., Lauterbach) is beneficial. Familiarity with configuration management, including version control, automated build systems, release management, and technical documentation. Strong analytical skills in requirements analysis, user story development, backlog management, and estimation. Excellent communication, leadership, and interpersonal skills, with the ability to collaborate across teams and influence stakeholders. Experience in industrial printing or related …
Monitor production systems and infrastructure, ensuring uptime and performance metrics are met Troubleshoot, diagnose, and resolve production issues in real time, minimizing service impact Manage incident response, including escalation, root cause analysis, and post-mortem reporting Collaborate with engineering teams to develop and implement monitoring tools, alert systems, and automated recovery processes Analyze system logs, metrics, and …
entity. Serve as a senior incident responder, addressing emerging threats across the environment. Collaborate with infrastructure, network, and cross-functional teams to contain, investigate, and remediate security incidents. Conduct root cause analysis and participate in forensic investigations as needed. Enhance system visibility by expanding logging coverage and implementing additional monitoring capabilities. Maintain, update, and regularly test incident … Ability to manage time and prioritize work to maximize productivity Excellent communication skills (both written and verbal) Exceptional attention to detail and quality Excellent problem-solving techniques and trouble analysis skills Endpoint security concepts, controls, and best practices for Servers (e.g. Windows and Linux) General IT networking concepts, protocols, standards and network security concepts, controls, and best practices Cryptography …
need. While we obsess over incident response, in this role you will also develop tools to scale our service quality, and provide critical input for product prioritization to address root causes of why the customer experienced an incident in the first place. Our advertising customers are likely Amazon customers, and we take seriously maintaining the high customer service bar … set by Amazon. Key job responsibilities - Independently handling complex customer issues by reproducing cases, root cause analysis, and providing prioritization input - Demonstrating deep technical expertise and advanced problem-solving for critical programmatic advertising issues - Serving as an escalation point, owning resolution of the most complex, cross-organizational issues - Communicating directly with internal teams to investigate, define workarounds …
new and updated system changes Developing, executing, and improving documentation for installation, configuration, hardening, and operations and maintenance tasks Ensuring compliance with IT infrastructure standards, policies, and procedures Conducting root cause analysis and resolving system and application faults and errors Ensuring operating systems and applications comply with Department of Defense (DoD) guidelines, including DISA Security Technical Implementation …
operations and maintenance tasks Document activities, status, and issues worked on Provide input to and follow Configuration Management processes Ensure adherence to IT infrastructure standards, policies, and procedures Perform root cause analysis and resolve system and application faults and errors Maintain working knowledge of Microsoft Active Directory, Group Policy Objects (GPOs), DHCP, DNS, and PowerShell General understanding …
of infrastructure components. 2. Monitoring and Incident Management: - Develop and maintain monitoring solutions to proactively identify performance bottlenecks, system outages, and other potential issues. - Participate in incident response and root cause analysis efforts to drive continuous improvement and prevent future incidents. 3. Reliability and Performance Optimization: - Optimise system performance, reliability, and cost efficiency through continuous monitoring, performance …
Shrivenham, Oxfordshire, United Kingdom Hybrid / WFH Options
Gold Group
Collaborate with engineering teams to support unified access devices (UADs), endpoint management, and virtualized environments. * Provide hands-on support for automation scripts, workflows, and infrastructure monitoring tools. * Contribute to root cause analysis efforts for recurring platform incidents. * Support capacity planning and performance optimization by analysing system usage and trends. * Offer feedback on tools and processes, identifying improvements …
and test network engineering/administration activities. • Create and maintain Standard Operating Procedures (SOPs) and technical documentation. • Provide follow-up reports (technical findings, feedback, and resolution steps taken) for Root Cause Analysis and process improvement initiatives. Required Qualifications: • Minimum of a Bachelor's degree in Science, with 12-15 years' experience or Master's degree with …