Incident Management Jobs in the UK

226 to 250 of 1,551 Incident Management Jobs in the UK

Information Technology Service Delivery Manager

London Area, United Kingdom
Venquis
business objectives, meet service level agreements (SLAs), and provide a seamless user experience. Key Responsibilities Oversee and improve the entire lifecycle of key ITIL practices and be proficient in Incident, Problem, Change, Asset, Transition and Service Request management, ensuring timely resolution and fulfilment in alignment with service level agreements (SLAs). Manage End to End Service Provision by … acting as a point of escalation, ensuring seamless service delivery by adhering to established systems, processes, and methodologies. Manage Service Communications by providing regular incident & maintenance and downtime updates as well as reports to Senior Management on all aspects of service performance, ensuring transparency and timely communication on any issues. Manage SLAs/SLOs and develop … business to provide consultative input on solution changes, updates, and upgrades, and assist in defining and communicating accordingly. Take ownership of Major incidents, coordinating resolution efforts and conducting post-incident reports for internal distribution. On occasion these may occur out of hours. Manage the escalation of incidents and service requests to 3rd line support or external vendors as needed.

Manager, Cloud Site Reliability Engineering

Reading, England, United Kingdom
Barracuda Networks Inc
reliability, establish SLOs, and implement monitoring and alerting strategies Team Leadership: Build, mentor, and grow a high-performing SRE team while fostering a culture of innovation and continuous improvement Incident Management: Establish and optimize incident response processes, lead major incident reviews, and drive systematic improvements Automation Development: Spearhead automation initiatives to reduce manual operations and improve … excellence Technical Strategy: Develop and execute technical roadmaps aligned with business goals and scaling requirements Security Integration: Ensure security best practices are embedded in infrastructure and operational processes Knowledge Management: Establish documentation standards and knowledge sharing practices across the organization Vendor Management: Evaluate and manage relationships with technical vendors and service providers Operational Excellence: Drive continuous improvement in … Deep understanding of distributed systems, cloud platforms (AWS/GCP/Azure), and modern infrastructure technologies Operational Excellence: Strong background in implementing SLOs, SLIs, and SLAs, with expertise in incident management and post-mortem processes Team Development: Experience in hiring, mentoring, and growing high-performing technical teams while fostering a culture of continuous learning Strategic Planning: Ability to More ❯

Application Support Senior Analyst – Commodities Markets - VP

Belfast, Northern Ireland, United Kingdom
Hybrid / WFH Options
11037 Citibank, N.A. United Kingdom
career move that will put you at the heart of a global financial institution? Then bring your skills in analysis, problem solving and communication to Citi’s Commodities Production Management team. By joining Citi, you will become part of a global organisation whose mission is to serve as a trusted partner to our clients by responsibly providing financial services … Commodities Production team to provide stability and support to the Commodities trading business (Trading, Operations, Middle Office, Downstream partners). Deliver efficiency and stability through automation, failover testing and incident and problem management lifecycles. Use Site Reliability Engineering methods to improve availability and performance of applications. Deliver TOIL reduction via automation, resiliency and observability. Manage, triage, communicate and … solutions from a tooling catalogue. What we’ll need from you: Relevant experience in an Application Support role, ideally with Site Reliability Engineering experience. Knowledge/experience of Problem Management Tools and the Incident management process. Demonstrable experience in providing Automation and TOIL-reducing solutions. Ability to demonstrate strong analytical and technical skills. Effective written and verbal …

Service Desk Analyst – Overseas

London, England, United Kingdom
Hybrid / WFH Options
Transputec Ltd
assistance. To provide technical support; answering support queries via phone, email & self service. To maintain a high degree of customer service for all support queries, adhering to all service management principles (Incident Management Process). To take ownership of user incidents and be proactive when dealing with user issues. To log all calls on the Service Desk … toolset. To capture accurate and truthful information from users on all incidents & requests, ensuring CI relationships are highlighted (Asset Management). Respond to requests from users and help them resolve hardware or software requirements. Support users in the use of IT equipment by providing necessary guidance and advice. To escalate more complex calls having captured all relevant information in … users via remote assistance, providing a high level of resolution at first contact To maintain a high degree of customer service for all support queries, adhering to all service management principles (Incident Management Process) To take ownership of user incidents and be proactive when dealing with user issues To log all calls on the Service Desk toolset More ❯

Technical Support and Operations Engineer

Manchester Area, United Kingdom
European Tech Recruit
providing top-tier support and fostering positive relationships. Collaborative Problem Solving: Work closely with cross-functional teams to diagnose and resolve technical challenges efficiently. Operational Excellence: Contribute to service management reporting, incident and problem management, and service improvement initiatives. Technical Troubleshooting: Leverage your software expertise to triage, reproduce, and resolve client-reported defects within SLAs. System Maintenance … Security: Ensure system compliance, manage data migrations, and maintain a secure operating environment. Automation & Optimization: Identify and implement opportunities to automate processes and enhance client engagement. Incident Management: Effectively track and manage incidents using ticketing systems (e.g., Jira), ensuring timely resolution. Your Skills and Attributes: Client-Centric Communication: Exceptional verbal and written communication skills with the ability to … database querying (SQL Server, Oracle), and scripting languages (Bash, Python). Cloud Expertise: Experience working in an AWS environment. Operational Acumen: Familiarity with virtualization, network protocols, and troubleshooting. Time Management & Prioritization: Ability to manage multiple priorities and meet deadlines in a fast-paced environment. Team Collaboration: Proven ability to build rapport with clients and collaborate effectively with internal teams. More ❯

Information Technology Service Delivery Manager

London, England, United Kingdom
Venquis
business objectives, meet service level agreements (SLAs), and provide a seamless user experience. Key Responsibilities Oversee and improve the entire lifecycle of key ITIL practices and be proficient in Incident, Problem, Change, Asset, Transition and Service Request management, ensuring timely resolution and fulfilment in alignment with service level agreements (SLAs). Manage End to End Service Provision by … acting as a point of escalation, ensuring seamless service delivery by adhering to established systems, processes, and methodologies. Manage Service Communications by providing regular incident & maintenance and downtime updates as well as reports to Senior Management on all aspects of service performance, ensuring transparency and timely communication on any issues. Manage SLAs/SLOs and develop … business to provide consultative input on solution changes, updates, and upgrades, and assist in defining and communicating accordingly. Take ownership of Major incidents, coordinating resolution efforts and conducting post-incident reports for internal distribution. On occasion these may occur out of hours. Manage the escalation of incidents and service requests to 3rd line support or external vendors as needed.

Manager, Cloud Site Reliability Engineering

Reading, England, United Kingdom
Barracuda
reliability, establish SLOs, and implement monitoring and alerting strategies Team Leadership: Build, mentor, and grow a high-performing SRE team while fostering a culture of innovation and continuous improvement Incident Management: Establish and optimize incident response processes, lead major incident reviews, and drive systematic improvements Automation Development: Spearhead automation initiatives to reduce manual operations and improve … excellence Technical Strategy: Develop and execute technical roadmaps aligned with business goals and scaling requirements Security Integration: Ensure security best practices are embedded in infrastructure and operational processes Knowledge Management: Establish documentation standards and knowledge sharing practices across the organization Vendor Management: Evaluate and manage relationships with technical vendors and service providers Operational Excellence: Drive continuous improvement in … Deep understanding of distributed systems, cloud platforms (AWS/GCP/Azure), and modern infrastructure technologies Operational Excellence: Strong background in implementing SLOs, SLIs, and SLAs, with expertise in incident management and post-mortem processes Team Development: Experience in hiring, mentoring, and growing high-performing technical teams while fostering a culture of continuous learning Strategic Planning: Ability to More ❯

IT Service Strategy Manager

Edinburgh, United Kingdom
Royal London
Type: Permanent Working style: Hybrid 50% home/office based Closing Date: 3rd July 2025 The main purpose of the role is to support the Head of IT Service Management in defining, executing, and delivering the IT Service strategy in Royal London Group, as well as manage ITIL processes. About the role Own and drive the strategy for Service … evolving and broader Business and IT goals and needs. Communicate IT Service strategy to relevant stakeholders and provide regular progress updates. Own multiple ITIL processes (for example Change Enablement, Incident, Problem, Request, Configuration Management), drive continuous improvement of these, and obtain/retain buy-in to the processes. Monitor and report on service performance, using data to inform … decisions and improvements. Build excellent working relationships with internal and external teams to the function. Collaborate with cross-functional teams including incident management, service desk, and infrastructure to optimise service delivery. Prepare business cases as required to receive funding for proposed transformation activities and present to senior committees. Sponsor and drive the adoption of Service Now across the More ❯
Employment Type: Permanent
Salary: GBP Annual

Client Platforms Engineer

London, England, United Kingdom
DV Trading LLC
Since spinning out of a large brokerage firm in 2016, DV Trading has rapidly scaled as an independent proprietary trading firm utilizing its own capital, trading strategies, and risk management methodologies to provide liquidity to worldwide financial markets and hedging opportunities to commodity producers and users. Now, DV group affiliates include two broker dealers, a cryptocurrency market making firm … and properly routed in our ticketing system. Technical Troubleshooting: Identify and diagnose hardware, software, and application problems. Resolve issues efficiently or escalate to the appropriate team for further investigation. Incident Management: Follow established procedures for incident management, ensuring timely and accurate resolution of technical problems while keeping the end-users informed of progress. Request Fulfillment: Handle … efficient manner. Documentation: Maintain accurate records of all service desk interactions, including incidents and service requests. Document solutions and create knowledge base articles to facilitate self-service. IT Asset Management: Assist in managing IT assets, including inventory, tracking, and allocation of hardware and software licenses. Continuous Improvement: Actively participate in ongoing training and development to enhance technical skills and More ❯

Online Support Engineer

Manchester, England, United Kingdom
Betfred Group
The team is responsible for supporting all applications and services used across the Betfred Digital Business as well as external customers, liaising with 3rd parties where relevant, reacting to incident escalations, as well as providing platform expertise to aid in the investigation and resolution of Live Incidents/Bugs. Working closely with internal Engineering teams and Product Owners, in … ensure effective communication and transfer of knowledge between Engineering teams and IT Operations, fundamentally minimising resolution times and turnaround of code-fixes. Job Duties Providing a technical input to Incident Reports for P1 Critical/P2 High priority issues. Daily call management of escalated tickets, providing regular customer/business updates on assigned support calls, and working in … for Internal and External Engineering/Support teams to aid in the timely delivery of FE bug fixes, with an in-depth knowledge of open support cases and statuses. Management of ‘Small-Works’ bugs across various digital platforms, co-ordinating with Engineering teams (internal squads and external) and managing release of code-fixes into Production environments. Aiding in the More ❯

CYBER SECURITY LEAD - SC, CYBER, ASSURANCE

West Midlands, United Kingdom
Adecco
to stakeholders. * Stay informed on emerging threats, technologies, and regulatory changes. * Support internal and external audits and regulatory inspections. ________________________________________ Essential Skills & Experience: * Proven experience in cyber security operations and incident management. * Strong knowledge of ISO 27001, NIST, and related frameworks. * Experience with GRC processes and tools. * Familiarity with SOC operations and threat detection technologies. * Excellent understanding of the cyber … Must be a British National and SC cleared or eligible. ________________________________________ Desirable: * Experience in regulated or high-security environments. * Knowledge of additional frameworks such as COBIT, ITIL, or GDPR. * Project management experience or certifications (e.g., PRINCE2, Agile). ________________________________________ Disclaimer: Adecco is acting as an Employment Agency. We are an equal opportunities employer and a listed supplier for this role. Your … CV will be handled with the utmost confidentiality, and we will always consult you before submitting it to any client. ________________________________________ Keywords: Cyber Security Lead, Incident Management, InfoSec, Cyber Assurance, ISO 27001, NIST, CISSP, CISM, GSLC, CCP, GIS, GRC, SOC, Risk Management, Threat Intelligence, Defence, Stakeholder Engagement, SC Clearance, Cyber Compliance, Security Governance, Security Awareness, West Midlands, Cyber More ❯
Employment Type: Permanent
Salary: £62000 - £73000/annum Benefits

Linux Sys Admin Manager

City of London, London, United Kingdom
Hybrid / WFH Options
REC SOLUTIONS LIMITED
development, QA and multiple production trading systems including some belonging to third party clients. Collaborate with development, networks, ops and product teams on strategic IT initiatives. Assist with planning, management and resource allocation of inter-departmental projects alongside the PM team. Oversee incident management, root cause analysis, and rapid resolution of system outages or performance degradation. Ensure … compliance of procedures such as change management, patch management and security and audit processes. Assist in the maintenance of these procedures. Support regular security audits and penetration tests, addressing findings and oversee any remediation work. Improve system monitoring, alerting, documentation, operating procedures and incident response processes. Manage, mentor, plan and coordinate the activities of both teams. Required … Experience Ideally 7+ years Linux system administration experience with at least 3 years in a managerial or team lead role. Strong expertise with RHEL-based systems, including installation, ongoing management, monitoring, performance tuning, system security hardening, etc. Proven track record of managing geographically distributed teams, including senior engineers and tier-1/2 support staff including on-call and More ❯
Employment Type: Permanent, Work From Home

Lead Systems Administrator - Linux

City of London, London, United Kingdom
Hybrid / WFH Options
REC SOLUTIONS LIMITED
development, QA and multiple production trading systems including some belonging to third party clients. Collaborate with development, networks, ops and product teams on strategic IT initiatives. Assist with planning, management and resource allocation of inter-departmental projects alongside the PM team. Oversee incident management, root cause analysis, and rapid resolution of system outages or performance degradation. Ensure … compliance of procedures such as change management, patch management and security and audit processes. Assist in the maintenance of these procedures. Support regular security audits and penetration tests, addressing findings and oversee any remediation work. Improve system monitoring, alerting, documentation, operating procedures and incident response processes. Manage, mentor, plan and coordinate the activities of both teams. Required … Experience Ideally 7+ years Linux system administration experience with at least 3 years in a managerial or team lead role. Strong expertise with RHEL-based systems, including installation, ongoing management, monitoring, performance tuning, system security hardening, etc. Proven track record of managing geographically distributed teams, including senior engineers and tier-1/2 support staff including on-call and More ❯
Employment Type: Permanent, Work From Home

Lead Systems Administrator - Linux

London, England, United Kingdom
Hybrid / WFH Options
Chicago Organizing
development, QA and multiple production trading systems including some belonging to third party clients. Collaborate with development, networks, ops and product teams on strategic IT initiatives. Assist with planning, management and resource allocation of inter-departmental projects alongside the PM team. Oversee incident management, root cause analysis, and rapid resolution of system outages or performance degradation. Ensure … compliance of procedures such as change management, patch management and security and audit processes. Assist in the maintenance of these procedures. Support regular security audits and penetration tests, addressing findings and oversee any remediation work. Improve system monitoring, alerting, documentation, operating procedures and incident response processes. Manage, mentor, plan and coordinate the activities of both teams. Required … Experience Ideally 7+ years Linux system administration experience with at least 3 years in a managerial or team lead role. Strong expertise with RHEL-based systems, including installation, ongoing management, monitoring, performance tuning, system security hardening, etc. Proven track record of managing geographically distributed teams, including senior engineers and tier-1/2 support staff including on-call and More ❯

OT Cyber Security Analyst

Reading, Berkshire, United Kingdom
Hybrid / WFH Options
Thames Water Utilities Limited
million customers. In this role, you will be responsible for maintaining SecOps solutions, controls, and processes across the organisation, while mentoring and leading the SOC team to ensure effective management of OT alerts and incidents. This position requires a deep understanding of SecOps concepts, technologies, and best practices, specifically across IT and OT environments. You will be tasked with … ensuring robust incident management, proactive threat detection, and continuous improvement of our security posture. Strong communication and collaboration skills are essential as you will work closely with cross-functional teams to mitigate risks and protect Thames Water's essential services. What you'll do as an OT Senior Cyber Security Analyst Contextualise OT Specific Threats: • Understand the Operational … and proportionate controls. • Perform proactive activities such as threat hunting to uncover vulnerabilities and ensure continuous risk reduction. • Provide tangible metrics to demonstrate risk reduction and reduced technical debt. Incident Readiness & Response: • Lead the incident triage and response process, ensuring effective management and remediation of cyber security incidents. • Improve incident management by reducing business impacts More ❯
Employment Type: Permanent
Salary: GBP Annual

IT Environments Manager

Manchester Area, United Kingdom
JSS Search
ensure optimal setup, configuration, maintenance, and security of both the development and production environments, ensuring that IT systems and infrastructures are reliable, scalable, and secure. Key Responsibilities Leadership Environment Management: Deployment & Automation: Performance & Scalability: Security & Compliance: Collaboration & Stakeholder Management: Documentation & Reporting: Incident Management & Problem Resolution: Capacity Planning: Escalate issues as appropriate. Manage assigned risks and issues. … communication skills, with the ability to collaborate across technical and non-technical teams. Preferred Qualifications : Experience with container orchestration platforms (e.g., Kubernetes). Familiarity with agile methodologies and project management tools (e.g., Jira, Confluence). More ❯

Service Desk Analyst (SC Cleared) 24 x 7 Shift

Nottingham, England, United Kingdom
Hybrid / WFH Options
Computacenter AG & Co. oHG
service goes live. The analyst will manage technical aspects of service delivery according to standards and procedures, including problem investigation, support documentation, and technical coaching. What you’ll do Incident/Request Management 80% Manage incidents routed from First Level analysts and resolve within knowledge and contract limits. Maintain technical knowledge related to customer-specific applications. Progress and … close incidents satisfactorily in the incident management system. Coordinate with team and other Service Analysts/customers on open incidents to ensure SLAs are met. Escalate potential service issues to the Team Leader. Collaborate with the Team Leader on specific projects as needed. Knowledge Management 20% Review and update technical support documents and procedures based on experience More ❯

Service Desk Analyst

Tewkesbury, England, United Kingdom
PentenAmio UK
from end users, either via email, ticketing or over the phone. Provide advanced troubleshooting for customer incidents. Fulfil service requests. Hardware provisioning activities. Help lead and support the Major Incident Management Process. Help lead and support the Problem Management Process. Create and update process documentation and user guides. Collaborate with other PentenAmio teams to drive incident and problem resolution. Assist the Operations team with inter-team knowledge transfer and workload integration. Provide management updates for service or customer issues. What we're looking for Technical Proficiency and good troubleshooting ability Experience with Windows and Linux Desktop environments A fundamental understanding of IT networks Certification in or a good working knowledge of ITIL Proficient in … Desk experience Customer focused Excellent communicator face to face and online Desirable skills: Experience in Secure communications Unix experience Knowledge of Wi-Fi networks Experience with Apple mobile device management Qualifications & Eligibility To be eligible for this position you must: Be a British citizen; and Hold or be able to obtain an NSV-DV clearance. Previous experience as a …

Cloud Security Engineer Tombola

Sunderland, United Kingdom
CyberNorth
You'll be hands-on, designing, implementing, and managing top-notch security solutions across all our cloud environments. You'll also play a key part in developing our vulnerability management program, working closely with our operational support, infrastructure, and development teams. Plus, you'll be right in the thick of security event monitoring, threat intelligence, and incident management … policy, standards, and guidelines. Threat Intelligence: You'll monitor and apply current and emerging threat intelligence, using tools like Google Threat Intelligence to proactively spot and tackle digital threats. Incident Response: You'll actively monitor for security incidents and jump into action with our incident response teams to contain, investigate, and prevent future security hiccups. Defining Controls: You … including firewalls, WAF, anti-virus, and O365 compliance & security centre. Familiarity with NIST (CSF Framework 2.0), ISO 27001, PCI-DSS, and GDPR. Experience operating and managing SIEM solutions, vulnerability management tools, and secure configuration tooling. Ability to use PowerShell and Python scripting for security automation. Experience working in or with agile and/or SecOps oriented teams. A proven More ❯
Employment Type: Permanent
Salary: GBP Annual

Cloud Security Engineer - Sunderland (Hybrid) Sunderland, UK

Sunderland, United Kingdom
Hybrid / WFH Options
Tombola
You'll be hands-on, designing, implementing, and managing top-notch security solutions across all our cloud environments. You'll also play a key part in developing our vulnerability management program, working closely with our operational support, infrastructure, and development teams. Plus, you'll be right in the thick of security event monitoring, threat intelligence, and incident management … policy, standards, and guidelines. Threat Intelligence: You'll monitor and apply current and emerging threat intelligence, using tools like Google Threat Intelligence to proactively spot and tackle digital threats. Incident Response: You'll actively monitor for security incidents and jump into action with our incident response teams to contain, investigate, and prevent future security hiccups. Defining Controls: You … WAF, anti-virus, and O365 compliance & security centre . Familiarity with NIST (CSF Framework 2.0), ISO 27001, PCI-DSS, and GDPR . Experience operating and managing SIEM solutions , vulnerability management tools, and secure configuration tooling. Ability to use PowerShell and Python scripting for security automation. Experience working in or with agile and/or SecOps oriented teams . A More ❯
Employment Type: Permanent
Salary: GBP Annual

IT Support Officer Apprentice

Birmingham, England, United Kingdom
Getting In Limited
Description We are now searching for 3 candidates to join us on an IT Azure Level 3 Apprenticeship. Under supervision, you will assist and provide a key second line incident management support service to all users via telephone, remote support software and site visits. Duties will be to: Provide a key second line Incident Management, Request … issues to enable resolution Respond to and resolve all incidents within agreed service levels Escalate unresolved incidents to third line support specialists with full information Provide day-to-day management and monitoring of team specific ITBM support queue within agreed Service Level Agreements, updating and/or closing tickets and providing confirmation to the Customer as required Update the Ticket …

Head of Corporate IT

Manchester, United Kingdom
Vix Technology
Collaboration: Partner with cross-functional teams to ensure IT solutions support business initiatives and integrate seamlessly. Compliance & Security: Maintain regulatory compliance (ISO27001, GDPR) and implement robust cybersecurity measures. Vendor Management: Oversee vendor relationships, procurement, and contracts to deliver high-quality, cost-effective IT services. Resilience & Incident Management: Lead IT incident resolution efforts, ensuring swift recovery and … clear stakeholder communication. Change Management: Drive successful adoption of new systems and processes while supporting ongoing improvements. What You Bring to the Role We're looking for someone who can: Lead Globally: Proven experience managing international teams, balancing local needs with global priorities, and navigating cultural diversity. Think Strategically: Expertise in developing IT strategies that align with business growth … ability to deliver cost-effective, high-quality IT services through vendor and contract management. Drive Innovation: Familiarity with digital transformation, cloud computing, and emerging technologies. Certifications in IT service management or project management are highly desirable. What's in it for you? Besides the opportunity to work for a global company that is customer and people focused, we More ❯
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

United Kingdom
luupli
maintain cloud-based infrastructure using AWS and Terraform. - Implement and enhance infrastructure-as-code (IaC) practices using Terraform to ensure reproducibility and scalability of infrastructure components. 2. Monitoring and Incident Management: - Develop and maintain monitoring solutions to proactively identify performance bottlenecks, system outages, and other potential issues. - Participate in incident response and root cause analysis efforts to … Security and Compliance: - Collaborate with security teams to implement best practices for securing cloud infrastructure and services. - Ensure compliance with relevant industry standards and regulations. 6. Deployment and Release Management: - Support CI/CD pipelines for application deployments and updates. - Contribute to the design and implementation of deployment strategies that promote zero-downtime releases. 7. Documentation and Knowledge Sharing … Maintain clear and up-to-date documentation for infrastructure configurations, processes, and incident resolution procedures. - Participate in knowledge sharing with team members to enhance overall expertise and skill sets. Requirements: 1. Education and Experience: - Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience). - Proven experience as a Site Reliability Engineer or similar More ❯
Employment Type: Permanent
Salary: GBP Annual

Site Reliability Engineer

London, England, United Kingdom
luupli
maintain cloud-based infrastructure using AWS and Terraform. - Implement and enhance infrastructure-as-code (IaC) practices using Terraform to ensure reproducibility and scalability of infrastructure components. 2. Monitoring and Incident Management: - Develop and maintain monitoring solutions to proactively identify performance bottlenecks, system outages, and other potential issues. - Participate in incident response and root cause analysis efforts to … Security and Compliance: - Collaborate with security teams to implement best practices for securing cloud infrastructure and services. - Ensure compliance with relevant industry standards and regulations. 6. Deployment and Release Management: - Support CI/CD pipelines for application deployments and updates. - Contribute to the design and implementation of deployment strategies that promote zero-downtime releases. 7. Documentation and Knowledge Sharing … Maintain clear and up-to-date documentation for infrastructure configurations, processes, and incident resolution procedures. - Participate in knowledge sharing with team members to enhance overall expertise and skill sets. Requirements: 1. Education and Experience: - Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent practical experience). - Proven experience as a Site Reliability Engineer or similar More ❯

Data Center Engineering Operation Engineer, DCEO

London, United Kingdom
Amazon
performance in the aspects of safety, security, availability, productivity, capacity and efficiency. We are looking for a proven Data Center Engineering Operations (DCEO) Engineer with experience in critical facilities management, and a results-driven individual with strong technical understanding and the drive and vision to take our data center operations to the next level. The role will be reporting … and respond to emergency services for critical systems including switchgear, generators, UPS systems, power distribution equipment, chillers, cooling towers, computer room air handlers, building monitoring systems, etc.; Generate change management requests & incident management tickets for the Data Center facility; Engage in non-office on-call responsibilities and respond promptly to emergency situations or incidents, such as power … Oversight of third-party vendors to ensure that all work performed is in accordance with established safety protocols, best practices, and local legislation; Utilization of administrative tools for change management, ticketing, and asset management; Supporting project/stakeholder team and vendor activities; Oversight of all Commissioning activities; Completion of assigned project tasks and managing associated reporting metrics; Engagement …
Employment Type: Permanent
Salary: GBP Annual

Incident Management
10th Percentile: £27,000
25th Percentile: £37,170
Median: £55,000
75th Percentile: £68,750
90th Percentile: £94,000