City of London, Greater London, UK Hybrid / WFH Options
Infinigate Group
configuring, updating, and monitoring security tools and software, such as antivirus, encryption, authentication, SIEM etc. Evaluate, research and manage emerging cyber security threats. Support the incident management process through root cause analysis. Respond to and resolve security incidents and events, such as malware infections, phishing attempts, denial-of-service attacks, data breaches, etc. Liaise with stakeholders in relation … Exposure to security monitoring technologies. Understanding of Incident Response, Cyber Kill Chain, ATT&CK. Knowledge and experience of common programming languages, e.g. Python, C++, PowerShell, JavaScript. Ability to perform root cause analysis. Experience with vulnerability assessments. Ability to discover, design and document security implementations. Strong networking skills. Good understanding of securing Cloud technologies through native and multi …
and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging-related incidents, including root cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low …
City of London, England, United Kingdom Hybrid / WFH Options
Cognitive Group | Part of the Focus Cloud Group
Cleared or Eligible for SC Clearance. Your responsibilities: Deploy, configure, and monitor AWS services ensuring high availability, scalability, and security. Respond to and resolve infrastructure and service incidents with root cause analysis and preventive measures. Handle change requests, track recurring issues, and work on long-term fixes to improve system stability. Implement and maintain observability solutions using … configuration and deployment management experience with CI/CD. Desirable skills: Hands-on experience with Terraform or CloudFormation for infrastructure provisioning and automation. Strong knowledge of Splunk for log analysis and troubleshooting. Strong problem-solving skills and analytical thinking.
and compliance requirements. • Act as the primary point of contact for internal business units (including Operations, Compliance & Transactional Banking), IT and external vendors, regarding service performance and enhancements. • Lead root cause analysis and resolution of major incidents. Drive problem management to reduce recurring issues and improve service stability. • Manage projects involving any future enhancements or regulatory changes …
City of London, London, United Kingdom Hybrid / WFH Options
REC SOLUTIONS LIMITED
with development, networks, ops and product teams on strategic IT initiatives. Assist with planning, management and resource allocation of inter-departmental projects alongside the PM team. Oversee incident management, root cause analysis, and rapid resolution of system outages or performance degradation. Ensure compliance with procedures such as change management, patch management and security and audit processes. Assist … understanding of cybersecurity principles and experience implementing security measures in a regulated environment. Ability to coach, mentor, and upskill staff; develop career paths and ensure team resilience. Experience undertaking root cause analysis, including prevention-orientated solution reporting. Working experience with deployment tools (e.g. GitLab pipelines) and rollback strategies. Proficiency in managing bare-metal servers, virtualization platforms such …
AWS Responsibilities: Monitor security event logs and alerts generated by various security technologies, including SIEM, IDS/IPS, firewalls, and endpoint protection systems. Conduct host forensics, network forensics, log analysis, and malware triage in support of incident response investigations. Identify, analyze, and assess potential insider threats through behavioral analytics, log review, and threat intelligence. Maintain and improve SOC processes … and refine insider risk policies to ensure they are effective and up to date. Develop and implement automated processes for monitoring and enforcing insider risk policies. Participate in security root cause analysis and forensics as part of NorthMark Strategies’ Cyber Incident Response Plan. Develop comprehensive and accurate reports and presentations for both technical and executive audiences. Stay …
City of London, London, United Kingdom Hybrid / WFH Options
Tate Recruitment
storage, backups, and Linux systems using tools such as Ansible, Terraform, and GitHub. Collaborate with cross-functional teams to align infrastructure delivery with DevOps best practices. Lead incident response, root cause analysis, and ongoing support for critical infrastructure services. Define and implement infrastructure administration standards and procedures. Champion Infrastructure as Code and continuous improvement across the hosting …
City of London, London, United Kingdom Hybrid / WFH Options
dnevo Partners
and follow-up actions. Work closely with cross-functional teams on data-related projects and continuous improvement initiatives. Identify and investigate data quality issues, contributing to the development of root cause analyses and solutions. Stay up-to-date with evolving data technologies, tools, and industry trends. Support the definition of data quality methodologies and standards across the business.
production systems using tools such as Prometheus, Grafana, or Datadog. Collaborate with development and QA teams to improve deployment processes and system reliability. Contribute to incident response, troubleshooting, and root cause analysis. Requirements: Approximately 18 months of experience in a DevOps, Site Reliability, or infrastructure-focused role. Working knowledge of Linux-based systems and scripting languages (e.g. …
City of London, London, United Kingdom Hybrid / WFH Options
Explore Group
and scale Kubernetes clusters hosting critical microservices. Design and enhance observability, alerting, and incident response processes. Collaborate closely with engineers to ensure systems are reliable, secure, and performant. Lead root cause analysis for production incidents and help prevent recurrence. Build tooling to automate repetitive tasks and improve deployment pipelines (CI/CD). Participate in on-call rotation …
prem environments. What You’ll Be Doing: Managing and supporting Solace PubSub+ appliances and software brokers across cloud and on-prem platforms. Responding to production incidents and working on root cause analysis and long-term fixes. Monitoring system health and performance with Prometheus, Grafana, and custom dashboards. Optimising Solace across WAN environments for secure, low-latency message …
City of London, Greater London, UK Hybrid / WFH Options
Halian
endpoints are properly configured and updated. 2nd Line Support: Respond to and resolve escalated 2nd line support tickets, ensuring timely resolution of technical issues. Provide expert-level troubleshooting and root cause analysis for more complex issues. Work closely with end-users, understanding their requirements and delivering technical solutions. Escalate issues to senior engineers as needed while keeping …
Central London, London, United Kingdom Hybrid / WFH Options
Halian Technology Limited
endpoints are properly configured and updated. 2nd Line Support: Respond to and resolve escalated 2nd line support tickets, ensuring timely resolution of technical issues. Provide expert-level troubleshooting and root cause analysis for more complex issues. Work closely with end-users, understanding their requirements and delivering technical solutions. Escalate issues to senior engineers as needed while keeping …
operational performance, and security compliance. Facilitate effective communication between IT teams and business units. Problem Solving and Incident Management: Manage and resolve high-priority incidents and critical issues. Conduct root cause analysis and implement corrective actions to prevent recurrence. Develop and maintain incident response plans and procedures. Requirements: Proven experience as a Digital Operations Manager, IT Manager …
maintain project plans, schedules, and budgets. Manage and control project costs and financial performance, including approval of timesheets. Facilitate stakeholder meetings to align project goals and address concerns proactively. Conduct root cause analysis for issues and propose corrective actions. Oversee project scope, risks, and changes, ensuring alignment with project objectives. Prepare and present status reports to clients and … feedback constructively. Required Skills: Strong leadership and problem-solving abilities. Excellent communication and interpersonal skills. Proficiency in project management tools and techniques. Ability to work independently with some oversight. Root cause analysis and continuous improvement mindset. Preferred Skills: A recognised project management certification, such as CAPM, Prince2, or APM, is preferred. Demonstrable understanding of both Agile and …
a hands-on leadership role - you won’t just guide others, you’ll be the go-to expert when systems are under pressure. You'll lead incident response, own root cause analysis, and solve performance issues like memory leaks, outages, and flaky services. Your focus will include: Leading incident management, post-mortems, and blameless RCAs. Building scalable …
advantageous. Experience using a variety of analytical tools and methods to identify security compromises within large and sophisticated data sets. Understanding of techniques and tools to perform forensics and root cause analysis. Ability to communicate technical issues to non-technical audiences and explain the impact of vulnerabilities or threats in business-focused language. Industry certifications (e.g., CompTIA, CEH …
Multiple Interim Data Governance & Data Analysis Positions – Major Banking Client. We are currently recruiting on behalf of a prominent banking client for several Interim Data Governance & Data Analysis roles at varying corporate levels. These positions are central to a high-impact Data Transformation Programme within the organisation’s Data Office. This is an exciting opportunity for data professionals … region by contributing to the strategic development and execution of the organisation’s EMEA Data Strategy. The Ideal Candidate: The ideal candidate will bring strong expertise in Data Analysis combined with a working knowledge of Data Governance principles, Data Migration, Cloud Transformations, and Operational Risk management practices. This unique blend of skills will enable the successful candidate to provide … both analytical depth and governance oversight, supporting the delivery of a robust and compliant data environment. Key Responsibilities: Conduct in-depth data analysis to support governance, quality, and risk assessment across Risk and Finance data assets. Manage data definitions, metadata, and lineage for high-priority data use cases, ensuring consistency and transparency. Collaborate with stakeholders to align business needs …
and efficiency. Automate configuration, provisioning, and deployment to reduce manual effort and streamline operations. Implement and uphold security standards, including encryption, access control, and compliance. Lead incident response and root cause analysis, applying preventive measures to avoid recurrence. Collaborate across teams (QA, DevOps, IT) to troubleshoot and enhance system performance. Maintain clear documentation for configurations, procedures, and … with a focus on Python. Skilled in TDD and BDD, primarily using Python. Deep understanding of distributed systems, networking, storage, and compute management. Strong troubleshooting skills, with experience in root cause analysis and timely resolution. Knowledge of security standards (ISO27001, NIST, GDPR) and infrastructure security best practices. Experienced with monitoring/logging tools like Splunk, Grafana, and …
City of London, London, United Kingdom Hybrid / WFH Options
Owen Thomas | Pending B Corp™
and efficiency. Automate configuration, provisioning, and deployment to reduce manual effort and streamline operations. Implement and uphold security standards, including encryption, access control, and compliance. Lead incident response and root cause analysis, applying preventive measures to avoid recurrence. Collaborate across teams (QA, DevOps, IT) to troubleshoot and enhance system performance. Maintain clear documentation for configurations, procedures, and … with a focus on Python. Skilled in TDD and BDD, primarily using Python. Deep understanding of distributed systems, networking, storage, and compute management. Strong troubleshooting skills, with experience in root cause analysis and timely resolution. Knowledge of security standards (ISO27001, NIST, GDPR) and infrastructure security best practices. Experienced with monitoring/logging tools like Splunk, Grafana, and …
Central London / West End, London, United Kingdom Hybrid / WFH Options
Owen Thomas | Pending B Corp™
and efficiency. Automate configuration, provisioning, and deployment to reduce manual effort and streamline operations. Implement and uphold security standards, including encryption, access control, and compliance. Lead incident response and root cause analysis, applying preventive measures to avoid recurrence. Collaborate across teams (QA, DevOps, IT) to troubleshoot and enhance system performance. Maintain clear documentation for configurations, procedures, and … with a focus on Python. Skilled in TDD and BDD, primarily using Python. Deep understanding of distributed systems, networking, storage, and compute management. Strong troubleshooting skills, with experience in root cause analysis and timely resolution. Knowledge of security standards (ISO27001, NIST, GDPR) and infrastructure security best practices. Experienced with monitoring/logging tools like Splunk, Grafana, and …
City of London, England, United Kingdom Hybrid / WFH Options
Owen Thomas | Pending B Corp™
and efficiency. Automate configuration, provisioning, and deployment to reduce manual effort and streamline operations. Implement and uphold security standards, including encryption, access control, and compliance. Lead incident response and root cause analysis, applying preventive measures to avoid recurrence. Collaborate across teams (QA, DevOps, IT) to troubleshoot and enhance system performance. Maintain clear documentation for configurations, procedures, and … with a focus on Python. Skilled in TDD and BDD, primarily using Python. Deep understanding of distributed systems, networking, storage, and compute management. Strong troubleshooting skills, with experience in root cause analysis and timely resolution. Knowledge of security standards (ISO27001, NIST, GDPR) and infrastructure security best practices. Experienced with monitoring/logging tools like Splunk, Grafana, and …
West End of London, England, United Kingdom Hybrid / WFH Options
Owen Thomas | Pending B Corp™
and efficiency. Automate configuration, provisioning, and deployment to reduce manual effort and streamline operations. Implement and uphold security standards, including encryption, access control, and compliance. Lead incident response and root cause analysis, applying preventive measures to avoid recurrence. Collaborate across teams (QA, DevOps, IT) to troubleshoot and enhance system performance. Maintain clear documentation for configurations, procedures, and … with a focus on Python. Skilled in TDD and BDD, primarily using Python. Deep understanding of distributed systems, networking, storage, and compute management. Strong troubleshooting skills, with experience in root cause analysis and timely resolution. Knowledge of security standards (ISO27001, NIST, GDPR) and infrastructure security best practices. Experienced with monitoring/logging tools like Splunk, Grafana, and …
City of London, London, United Kingdom Hybrid / WFH Options
Stott and May
review, design, and implement infrastructure decisions. Maintain documentation for platforms, services, and pipelines. Audit activities to ensure compliance with security policies (including PCI DSS, GDPR, and PII). Perform root cause analysis and implement improvements to prevent incidents and optimize performance. Maintain and evolve monitoring platforms, including synthetic and application monitoring, responding to alerts and identifying bottlenecks.