Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Bath, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Glasgow, Scotland, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Brighton, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Reading, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Hemel Hempstead, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
London, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Watford, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Portsmouth, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Crawley, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
Hounslow, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
rapid and reliable code delivery across ● Work closely with the engineering team to support microservices architecture, with a focus on latency-sensitive and high-availability services. ● Monitor system performance, conduct root cause analysis, and implement observability best practices (metrics, logging, tracing). ● Harden infrastructure and deployments with infrastructure as code (Terraform/CDK/CloudFormation). ● Lead incident …
London, England, United Kingdom Hybrid / WFH Options
Input Output (IOHK)
define testing objectives and improve quality processes. Mentorship: Guide junior test engineers, fostering continuous learning and excellence. Test automation: Build and enhance automation tools and CI/CD integration. Root cause analysis: Lead troubleshooting and debugging to resolve defects efficiently. Align testing with business models and product focus, providing feedback on design and testing strategies. Process improvements …
London, England, United Kingdom Hybrid / WFH Options
Tide
Mentor engineering managers and team members, fostering continuous learning. Cross-Functional Collaboration: Work with Product, Infosec, Cloud, and other teams to align strategies. Incident Response: Support major incident responses, root cause analysis, and preventive measures. What we are looking for: Extensive experience in senior technical leadership roles in software or systems engineering. Proven leadership in SRE, QA …
Chantilly, Virginia, United States Hybrid / WFH Options
Edgesource
like Grafana and Prometheus. Ensure comprehensive monitoring, logging, and alerting for all services. Reliability and Performance: Ensure high availability and performance of services. Conduct capacity planning, performance tuning, and root cause analysis for incidents. Implement and maintain service level objectives (SLOs) and service level indicators (SLIs). Operational Excellence: Develop and enforce best practices for incident management …
Bradford, England, United Kingdom Hybrid / WFH Options
Yorkshire Water
operations. Maintain and update system configurations and related documentation. Incident and Problem Management: Manage and resolve incidents and service requests related to network systems in a timely manner. Conduct root cause analysis for recurring issues and implement solutions to prevent future occurrences. Document all support activities, including incident resolution steps and troubleshooting procedures. Supplier Management: Manage relationships …
San Francisco, California, United States Hybrid / WFH Options
SoFi
and track service-level objectives (SLOs) and key performance indicators (KPIs) to measure the availability, performance, and cost efficiency of platform services. Own complex incident resolution, guiding teams on root-cause analysis and ensuring continuous feedback loops to improve system resilience. What You'll Need: Experience & Education: Bachelor's or Master's degree in Computer Science, Software …
Virginia Beach, Virginia, United States Hybrid / WFH Options
CrowdStrike Holdings, Inc
Deploy and manage monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, Datadog, Splunk, ELK). Implement automated self-healing mechanisms and proactive monitoring solutions. Lead incident response, postmortems, and root cause analysis (RCA) to prevent future system disruptions. Ensure 24/7 system uptime through on-call rotation and escalation handling. Security, Compliance & IAM: Implement Identity and …
London, England, United Kingdom Hybrid / WFH Options
Funding Circle UK
applications, microservices, and infrastructure components. Support internal and external penetration testing engagements for Funding Circle applications, services, and cloud infrastructure. Contribute to vulnerability management processes, focusing on strategic remediation, root cause analysis, and preventative measures. Assist in developing and implementing security automation across cloud infrastructure configuration, vulnerability management, and compliance monitoring. Contribute to the implementation of robust …
London, England, United Kingdom Hybrid / WFH Options
GiveDirectly
data) to understand real-world needs and ship tools that directly support program delivery in the field. Debug and resolve production issues across our stack, with a focus on root cause analysis and long-term fixes. Advocate for sustainable engineering practices, including testing, documentation, and monitoring. Help shape our tech roadmap with an eye toward scale, maintainability …
City of London, London, United Kingdom Hybrid / WFH Options
dnevo Partners
and follow-up actions. Work closely with cross-functional teams on data-related projects and continuous improvement initiatives. Identify and investigate data quality issues, contributing to the development of root cause analyses and solutions. Stay up-to-date with evolving data technologies, tools, and industry trends. Support the definition of data quality methodologies and standards across the business. …
Southampton, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
regular security assessments, vulnerability scans, and monitor/respond to security incidents using tools such as Azure Sentinel and other security technologies (XDR, NDR, IDS/IPS, SIEM). Root Cause Analysis and Compliance: Perform root cause analysis for security incidents, implement corrective actions, and ensure compliance with industry regulations (GDPR, HIPAA). DevOps … containerization, knowledge of Azure Data Lake, Azure IoT Hub, and API tooling. Skills and Attributes: Strong understanding of cloud security principles and best practices. Excellent problem-solving, analytical, and root cause analysis skills. Effective communication and teamwork abilities. Competitive salary with eligibility for company bonus scheme (annual and quarterly payments). Private Medical Insurance and Medicash Scheme …
Peterborough, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
Security Analyst Role: As a Level 3 Security Analyst, you will be at the forefront of our Security Operations Center (SOC), monitoring and responding to security incidents, performing threat analysis, and contributing to the continuous improvement of our cybersecurity services. You will work within a dynamic team, ensuring the protection of our clients' digital assets while maintaining service excellence. … growth and certification support. Responsibilities: Monitor, analyse, and respond to security events and incidents within the SOC, ensuring timely detection and resolution in line with SLAs. Conduct thorough threat analysis and vulnerability assessments to identify potential security risks and implement mitigation strategies. Develop and refine incident response plans and playbooks to enhance SOC effectiveness. Perform root cause analysis (RCA) for high-priority incidents and contribute to service improvements. Provide expert recommendations on security measures and solutions to clients and colleagues. Engage in knowledge sharing within the SOC and wider teams to enhance security awareness. Participate in on-call rota for critical incident response and escalation. Work within designated shift patterns to ensure 24/ …
Reading, England, United Kingdom Hybrid / WFH Options
Medirest Signature
are seeking a Problem Management Lead to support and enhance the end-to-end Problem Management process across all Thames Water business areas. You will be responsible for identifying root causes of major incidents, ensuring timely corrective actions, and driving continuous service improvement to reduce service disruptions and improve operational stability. What you’ll do as a Problem Management … Lead Problem Management: Support the maintenance and execution of the Problem Management process, policies, and procedures. Coordinate and track problem records to root cause identification and permanent resolution. Ensure compliance with Problem Management standards across all service delivery teams and third-party suppliers. Governance & Compliance: Ensure Problem Management activities are aligned with ITIL standards and regulatory obligations. Support … Management. Track and report all problem-related KPIs, trends, and known errors. Supplier & Financial Management: Work within a multi-supplier environment to ensure timely supplier engagement and progress on root cause analysis and remediation. Support efforts to deliver cost-effective operational solutions and reduce service disruption costs. Risk & Change Management: Collaborate with the Risk Lead to identify …
Glasgow, Scotland, United Kingdom Hybrid / WFH Options
Ofgem
understanding of how data flows through systems from ingestion and transformation to representation. You’ll be confident using a range of techniques to draw out complex business needs, conduct root cause analysis and translate findings into requirements that are accessible to both technical and non-technical audiences. You’ll also support the delivery of new services and … Responsibilities Change Management Role: Requirements elicitation, workshopping, interviewing, prototyping. Business process modelling, analysing business needs, modelling system data. Value mapping and benefits realisation management. Evaluating options, defining requirements, gap analysis and improving business processes. Strategic analysis, internal and external environment analysis, root cause analysis. Stakeholder analysis and management. Delivery of user-centric business … solutions with agile working. Key Outputs and Deliverables: Impact analysis of proposed work and prioritisation of work stack based on this. Comprehensive gap analyses based on assessment of existing business processes alongside detailed requirements for new processes. Functional, non-functional, data and usability requirements and business rules, established with key business users. Requirements, modelled and documented, and translated to …