Terraform, Ansible. Monitor, troubleshoot, and optimize systems, networks, and application performance across hybrid environments. Collaborate with security, development, and operations teams to enforce DevSecOps best practices. Participate in incident response, root cause analysis, and implement long-term fixes. Maintain and document configurations, processes, and network topologies. Required Qualifications Extensive hands-on experience with F5 load balancers.
security event information across multiple technologies. Creating security use cases to enable the wider SOC to respond to a wider array of threats. Identify where automation can assist the Incident Response team when investigating suspicious activity. Creation of analytic content to enable quantifiable metrics on SOC performance. What are BAE Systems looking for from you? A strong technical
Manchester, North West, United Kingdom Hybrid / WFH Options
Wythenshawe Community Housing Group
the development, security, and resilience of WCHG's ICT infrastructure Act as product expert for Azure/M365 and on-premise solutions Own and manage ICT cyber security processes, including incident response Mentor and supervise the ICT Infrastructure Engineer and wider technology team Lead on data backup, replication, and disaster recovery testing Provide final line (4th line) technical support
specific technical skills This role can be based in our London, Knutsford or Glasgow locations. Purpose of the role To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring … maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance
ACM). Knowledge of compliance requirements that impact cloud security (e.g., GDPR, HIPAA, SOC 2) and experience in implementing controls to meet these requirements. Ability to design and execute incident response strategies within the AWS cloud, including the use of AWS CloudWatch, AWS Lambda, and other automated response tools.
storage systems, and related infrastructure in line with Change Management processes. Administer and monitor servers and systems, ensuring performance, updates, patches, and issue resolution. Manage network and infrastructure troubleshooting, incident response, on-call support, and site visits as required. Contribute to backup, disaster recovery, and security best practices to safeguard data and systems. Provide technical support, resolve infrastructure
including voice AI, automation, and predictive tools Overhaul the legacy CRM's UI/UX into a modern, high-performance platform Cybersecurity & Risk Management Own enterprise cybersecurity strategy, audits, and incident response Design post-attack processes and lead quarterly vulnerability assessments Infrastructure & Performance Optimise PHP/MySQL stack for speed, uptime, and stability Resolve CRM bottlenecks and implement diagnostic
voice AI, automation, and predictive tools Overhaul the legacy CRM’s UI/UX into a modern, high-performance platform Cybersecurity & Risk Management Own enterprise cybersecurity strategy, audits, and incident response Design post-attack processes and lead quarterly vulnerability assessments Infrastructure & Performance Optimise PHP/MySQL stack for speed, uptime, and stability Resolve CRM bottlenecks and implement diagnostic
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Moonpig
Solid understanding of cryptography, authentication and authorisation A great communicator with a collaborative, pragmatic mindset Ideally have experience measuring and improving security via tooling metrics Ideally have exposure to incident response or threat modelling Ideally knowledge of securing serverless or containerised environments If you have a background in software engineering and have a keen interest and solid understanding
Chester, Cheshire, United Kingdom Hybrid / WFH Options
Whelen Engineering
and Responsibilities Lead and mentor the IT help desk, systems, and network teams, ensuring high performance and professional growth. Oversee the day-to-day delivery of IT services, including incident response, service requests, system availability, and infrastructure support, while prioritizing and maintaining production systems uptime Manage work in the ticketing system (Jira), ensuring timely response, prioritization, and
critical technology infrastructure and resolve more multi-faceted technical issues, whilst minimizing disruption to operations. In this role you will apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. To be successful as a Database Engineer, you should have: Experience in … is reliable, scalable, and secure. Ensure the reliability, availability, and scalability of the systems, platforms, and technology through the application of software engineering techniques, automation, and best practices in incident response. Accountabilities Build Engineering: Development, delivery, and maintenance of high-quality infrastructure solutions to fulfil business requirements ensuring measurable reliability, performance, availability, and ease of use. Including the identification … of the appropriate technologies and solutions to meet business, optimisation, and resourcing requirements. Incident Management: Monitoring of IT infrastructure and system performance to measure, identify, address, and resolve any potential issues, vulnerabilities, or outages. Use of data to drive down mean time to resolution. Automation: Development and implementation of automated tasks and processes to improve efficiency and reduce manual
critical technology infrastructure and resolve more multi-faceted technical issues, whilst minimizing disruption to operations. In this role you will apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Check all associated application documentation thoroughly before clicking on the apply button … is reliable, scalable, and secure. Ensure the reliability, availability, and scalability of the systems, platforms, and technology through the application of software engineering techniques, automation, and best practices in incident response. Accountabilities Build Engineering: Development, delivery, and maintenance of high-quality infrastructure solutions to fulfil business requirements ensuring measurable reliability, performance, availability, and ease of use. Including the identification … of the appropriate technologies and solutions to meet business, optimisation, and resourcing requirements. Incident Management: Monitoring of IT infrastructure and system performance to measure, identify, address, and resolve any potential issues, vulnerabilities, or outages. Use of data to drive down mean time to resolution. Automation: Development and implementation of automated tasks and processes to improve efficiency and reduce manual
Newcastle Upon Tyne, Tyne and Wear, England, United Kingdom Hybrid / WFH Options
Lorien
Code (IaC) practices using Terraform. Building and optimising CI/CD pipelines to accelerate delivery. Implementing and maintaining monitoring and observability with Prometheus and Grafana. Enabling team collaboration and incident response through Slack and other ChatOps tools. Leading, mentoring, and supporting engineers (or preparing to step into people management if you're progressing into the role). Key
RAG, and prompt engineering Familiarity with Azure services and cloud ecosystems Excellent communication and presentation skills A passion for mentoring and developing engineering talent Experience with distributed systems and incident response Benefits: Flexible remote working Competitive salary 25 days holiday Private health insurance (after 1 year) Enhanced parental leave And more Please Note: This is a permanent role
Liverpool, Merseyside, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
RAG, and prompt engineering Familiarity with Azure services and cloud ecosystems Excellent communication and presentation skills A passion for mentoring and developing engineering talent Experience with distributed systems and incident response Benefits: Flexible remote working Competitive salary 25 days holiday Private health insurance (after 1 year) Enhanced parental leave And more Please Note: This is a permanent role
Leeds, West Yorkshire, United Kingdom Hybrid / WFH Options
Tenth Revolution Group
RAG, and prompt engineering Familiarity with Azure services and cloud ecosystems Excellent communication and presentation skills A passion for mentoring and developing engineering talent Experience with distributed systems and incident response Benefits: Flexible remote working Competitive salary 25 days holiday Private health insurance (after 1 year) Enhanced parental leave And more Please Note: This is a permanent role
South Shields, Tyne and Wear, England, United Kingdom
Jackson Hogg - Tech
Develop and manage integrations (APIs/EDIs) between operational systems, ERP, and external partners Collaborate with internal teams and external vendors to enhance system functionality and resolve issues Lead incident response for system outages and functionality failures, keeping stakeholders informed Train users and create training materials to support effective use of systems Maintain technical documentation for systems, configurations
well as job-specific technical skills This role can be based in our Knutsford or Glasgow locations. Purpose of the role To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring, maintenance … and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaboration with development teams
specific technical skills This role can be based in our London, Knutsford or Glasgow locations. Purpose of the role To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology through them. Accountabilities Availability, performance, and scalability of systems and services through proactive monitoring … maintenance, and capacity planning. Resolution, analysis and response to system outages and disruptions, and implement measures to prevent similar incidents from recurring. Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience. Monitoring and optimisation of system performance and resource usage, identify and address bottlenecks, and implement best practices for performance
Chester, Cheshire West and Chester, Cheshire, United Kingdom
Ascendion
teams, infrastructure, and DevOps to address platform issues and implement improvements. Architect and develop resilient backend systems primarily using Java, Spring, Kafka, and Oracle. Implement best practices for observability, incident response, and operational excellence in line with SRE principles. Drive automation and self-healing mechanisms across platform components. Provide technical leadership and hands-on coding as needed. Monitor … engineering experience. Strong Java expertise with deep understanding of backend design patterns and frameworks (Spring Boot preferred). Proven experience in Site Reliability Engineering (SRE), including monitoring, alerting, and incident management. Hands-on experience with Kafka, MuleSoft, and Oracle DB. Familiarity with performance tuning, system design, and distributed computing concepts. Experience with CI/CD pipelines and infrastructure-as
Barrow-in-Furness, Cumbria, England, United Kingdom Hybrid / WFH Options
F5
in a managed services environment Experience as a Service Line Design Authority across large accounts, bids, or transitions Understanding of contracts and commercials in service delivery Ability to lead incident response and engage with senior stakeholders Client-facing consulting skills with a focus on technology strategy Technical expertise required: Cisco ACI software-defined networks (multi-site & multi-pod
is reliable, scalable, and secure. Ensure the reliability, availability, and scalability of the systems, platforms, and technology through the application of software engineering techniques, automation, and best practices in incident response. Vice President Expectations To contribute or set strategy, drive requirements and make recommendations for change. Plan resources, budgets, and policies; manage and maintain policies/processes; deliver continuous
Newcastle Upon Tyne, Tyne and Wear, North East, United Kingdom Hybrid / WFH Options
The Bridge (IT Recruitment) Limited
SNOPs Lead to adapt the SNOPs roadmap priorities in line with shifts in industry, evolving threat landscape and regulatory requirements. Ensure effective 24/7 security operations (inc. security incident management) Collaborate closely with the Enterprise Resilience function (1st Line of Defence) to ensure integrated risk management and incident response. Promote stakeholder engagement and cross-functional collaboration to … a culture of security awareness and ownership across the organisation. Operational Oversight Ensure high availability, performance, and security of all technology systems and infrastructure. Monitor and improve service levels, incident resolution times, and system reliability metrics. Lead cross-functional coordination for escalations, major incidents, and service continuity planning. Team Leadership & Development Provide leadership and direction to platform tower leads … a complex, global environment. Deep understanding of IT infrastructure, cloud platforms (e.g., Azure), and enterprise collaboration tools (e.g., Microsoft 365). Strong grasp of ITIL-based service management, including incident, change, and problem management. Expertise in security and compliance frameworks, including DORA and Cyber Essentials Plus. Prior hands-on experience in delivering security solutions within enterprise environments Knowledge of