Head of Tech Operations Lead. The role holder will: Lead the Site Reliability Engineering (SRE) function to ensure platform stability, scalability, and performance. Own and evolve the major incident management process, including escalation, resolution, and post-mortem analysis and follow-up to closure. Drive resilience and disaster recovery strategy, ensuring regular testing and readiness across all critical … mentoring, and resource allocation to achieve team goals. Oversee technology budgets, prioritizing resources, optimizing costs, and ensuring investments support business goals while mitigating financial risks. Director Expectations Advise senior management and committees, influencing decisions and contributing to strategic initiatives. Manage resourcing, budgeting, and policy creation for a significant sub-function. Ensure compliance with policies and regulations, monitoring external environment
BBC output as well as a wide range of other production support services. This dayside role works as part of a shift-working team providing 24-hour operational support, incident management and stakeholder communication for a range of BBC platforms and services. YOUR KEY RESPONSIBILITIES AND IMPACT: The prime responsibility of the role is to provide 24/… Audition Knowledge of Apple iOS in a support environment Ability to support remote workers using a number of applications. OTHER SKILLS DESIRED: An understanding of the use of content management systems in large internet sites. Strong understanding of Intel based hardware. Knowledge and understanding of security issues in a large corporate networked environment. Knowledge of TCP/IP over
Central London, London, England, United Kingdom Hybrid / WFH Options
Reed
and VDI platforms. Ability to build desktops from scratch and a deep understanding of Windows security constructs. Excellent problem-solving skills and ability to manage Change, Problem, and Incident Management processes effectively. Day-to-day of the role: Provide analysis, troubleshooting, implementation, administration, security, and maintenance of the Windows Desktop/Active Directory system. Manage end-user Windows
Dubai, Whitechapel, Greater London, United Kingdom Hybrid / WFH Options
VIQU IT
driven leader with deep technical cybersecurity expertise Proven experience managing SOC, SIEM, and SOAR operations In-depth knowledge of NIST CSF, ISO 27001, and GDPR Strong experience in cybersecurity incident management Bachelor’s degree in Cybersecurity, Computer Science, IT, or related field 10–15 years of professional experience in cybersecurity, including leadership roles Hands-on knowledge of next
Greater London, Whitechapel, United Kingdom Hybrid / WFH Options
VIQU IT
driven leader with deep technical cybersecurity expertise Proven experience managing SOC, SIEM, and SOAR operations In-depth knowledge of NIST CSF, ISO 27001, and GDPR Strong experience in cybersecurity incident management Bachelor’s degree in Cybersecurity, Computer Science, IT, or related field 10–15 years of professional experience in cybersecurity, including leadership roles Hands-on knowledge of next
scenarios Understanding of ITSO/Barcodes fulfilment methods as well as more traditional methods of fulfilment Able to record and maintain accurate and timely data in ASSIST/JIRA Incident Management System relating to accreditations that are being undertaken and any incidents that may arise in accreditation or live pilot running. Able to extract, understand and analyse multiple
not limited to: EC2, S3, EKS, DynamoDB, EBS, CloudFormation, Lambda, VPC, Route 53 Experience operating in core SDLC CI/CD processes, along with SRE concepts - Monitoring, Alerting, Incident management. Worked within DevOps operating model, data analytics, various models and application of AI/ML in this space. BS degree in computer science or equivalent field Preferred Qualifications
TDD, CI/CD and pairing using tools like Git and GitHub. Experience of operationally managing software components once live, including: observability, logging, metrics, error reporting, debugging and live incident management. Experience of working with sensitive personal data. Competencies Experience working in/with cross-functional teams consisting of e.g. engineers, product, UX and non-technical stakeholders. Ability to
TDD, CI/CD and pairing using tools like Git and GitHub. Experience of operationally managing software components once live, including: observability, logging, metrics, error reporting, debugging and live incident management. Experience of working with sensitive personal data. Competencies Strong experience working and collaborating with vendors/partners. Experience working in/with cross-functional teams consisting of engineers
This is a unique opportunity to get involved in guiding and supporting the organisation in understanding and implementing effective information security controls, as well as ensuring risk and compliance management aligns with the business's risk appetite Role: Information Security Consultant Contract Type: Full time, Permanent Location: Holborn, London Why You'll Love It Here Healthcare: Individual & Family … BUPA healthcare Discounts: Up to 60% discount on Premier Inn stays and 25% discount on our Restaurant brand As an InfoSec Consultant, you will Support the effective management and resolution of Information Security incidents and/or data breaches following defined Incident Management processes. Alongside this, you will also monitor key controls across the areas you support
You'll act as the first line of defense for data-related incidents, rapidly diagnose root causes, and implement resilient solutions that keep critical reporting systems up and running. Incident Management & Triage Serve as on-call escalation for data pipeline incidents, including real-time stream failures and batch job errors. Rapidly analyze logs, metrics, and trace data to … pinpoint failure points across AWS, Flink, Kafka, and Python layers. Lead post-incident reviews: identify root causes, document findings, and drive corrective actions to closure. Reliability & Monitoring Design, implement, and maintain robust observability for data pipelines: dashboards, alerts, distributed tracing. Define SLOs/SLIs for data freshness, throughput, and error rates; continuously monitor and optimize. Automate capacity planning, scaling … to runbooks, design docs, and on-call playbooks detailing common failure modes and recovery steps. Work cross-functionally with DevOps, Security, and Product teams to align reliability goals and incident response workflows. Enhanced leave - 38 days inclusive of 8 UK Public Holidays Private Health Care including family cover Life Assurance - 5x salary Flexible working - work from home and/
trading, research and development teams in a fast paced, data driven environment, supporting DevOps applications so will have exposure to a range of DevOps practices including cloud, containerisation, configuration management and monitoring. Not only will you be able to upskill on your DevOps knowledge and work with the latest technology, but you will also receive a market leading salary … and bonuses whilst being surrounded by an array of talented professionals all working in a stable firm. Requirements: Strong knowledge across Application Support and incident management Experience with Linux and Windows Exposure to DevOps tools - AWS, Terraform, Docker/Kubernetes Experience with Scripting preferably Python/Bash If you are looking for an opportunity to work in a
improvements. Maintain fault-tolerant, scalable, and cost-effective infrastructures and services. Monitor availability, latency, and system health to keep our platform running smoothly. Lead blameless postmortems and refine our incident response processes. Provide feedback loops to development teams on operational gaps and resiliency concerns. Support services before they go live with system design consulting, capacity planning, and launch reviews. … SRE principles at scale, including deep knowledge of SLI/SLO/SLA differences. A product engineering background with strong coding skills in Python, C#, or similar. Experience with incident management frameworks and evolving them for efficiency. Expertise in cloud platforms (AWS preferred) and container orchestration (Docker, Kubernetes, ECS). Solid understanding of microservices, service mesh, and modern
cyber threats/attacks and has the appropriate recovery mechanisms in place. Ensuring FOS is prepared for and can effectively detect and respond to critical incidents by implementing cyber incident management processes. Continuously educating our people on information security awareness and working closely with our L&D colleagues to ensure that training and educational courses are in place. … information and security strategy and governance. Experience of leading and managing a team and a budget. Experience of managing a 3rd party service and hybrid teams in a matrix management model. Desirable Criteria CISSP, CISM or CRISC certification and some formal training in information security standards or significant professional experience. Why Financial Ombudsman Service? We are a values led
City of London, London, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment
rooted in integrity, creativity, and technical excellence, they've become a trusted partner across global industries. In this role you'll take ownership of platform reliability, resilience engineering, and incident management across cutting-edge cloud infrastructure. You'll play a key role in ensuring uptime, performance, and continuous improvement of core systems. The ideal candidate will be an
Employment Type: Permanent
Salary: £80000 - £90000/annum 38 Days Holiday, Healthcare, Pension
London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
culture rooted in integrity, creativity, and technical excellence, they've become a trusted partner across global industries. In this role you'll take ownership of platform reliability, resilience engineering, and incident management across cutting-edge cloud infrastructure. You'll play a key role in ensuring uptime, performance, and continuous improvement of core systems. The ideal candidate will be an experienced
similar role. Strong understanding of system architecture, design principles, and cloud platforms. Proficiency in scripting languages (e.g., Python) for automation purposes. Familiarity with monitoring tools (e.g., Prometheus, Grafana) and incident management systems. Effective communication skills to convey technical concepts to non-technical stakeholders. Adaptability to learn and implement new technologies and tools as required. This role is based
this position. Key responsibilities of the role: Lead on troubleshooting iTrent system issues and responding to customer enquiries, ensuring all actions and outcomes are documented in line with agreed incident management procedures Develop reports using a range of data extraction and report-writing tools to support HR and business operations Provide ongoing support for key system interfaces within
connectivity experience is achieved through a creative & action-orientated attitude to implement and manage support strategies and operational processes for new and existing services to include partner on-boarding, incident management Establishing and maintaining strong client relationships internally and externally through collaboration & inclusiveness, using the power of diverse teams & individuals working together to deliver outstanding performance. Creating, developing
passion to identify or develop strategies to mitigate manual intervention going forward and have a track record of designing and implementing cloud best practices (e.g. architecting, provisioning, deployment, monitoring, incident management, continual service improvement, cost containment, etc.) - Ability to work globally PREFERRED QUALIFICATIONS - Experience as a Senior Solutions Architect/Consultant supporting Cloud workloads across a variety of
skills for collaboration and documentation. Effective team player, supporting Senior Developers. Expertise in UiPath Orchestrator, Blue Prism Control Room, or Automation Anywhere Control Room. Proficient in monitoring tools and incident management. Knowledge of governance and compliance standards. Troubleshoot and resolve technical issues, provide L2/L3 support. Optimise bot performance and automation processes. LA International is an HMG approved