Kansas City Metropolitan Area. Responsibilities: Architect and develop Python-based microservices (FastAPI, Flask, or custom). Translate data formats (JSON, Parquet, Avro) and develop automation/scripting solutions. Drive root cause analysis and troubleshooting across staging and production. Lead integration efforts with DevOps, security, and cloud infrastructure teams. Guide CI/CD improvements, observability tooling, and service …
support of the company's technologies, including email, voicemail, and other enterprise systems. Take an active role in the Incident Response and Problem Management processes, representing the desktop environment. Provide root cause analysis for problems and measures to mitigate future occurrences. Supervise the daily activities of the end user and desktop support function including, but not limited to …
Milton Keynes, Buckinghamshire, England, United Kingdom Hybrid / WFH Options
Lorien
the Group IT Service Desk. Ensuring all changes to production services are subject to change control. Producing and updating documentation to improve the efficiency and effectiveness of systems. Conducting root cause analysis following unplanned disruptions to improve system availability. Conducting Problem Management of repeat incidents affecting business services. Experience with incident management systems, identify incident trends and …
qualification, audits, and corrective actions. Review and approve quality documentation to ensure compliance with specifications and regulatory standards. Monitor product performance and customer feedback for areas of improvement. Conduct root cause analysis and implement corrective and preventive actions. Collaborate with R&D, manufacturing, and supply chain teams to embed quality throughout the product lifecycle. Manage internal and …
Leatherhead, Surrey, England, United Kingdom Hybrid / WFH Options
Recruitvirt Ltd
data protection, recovery, and disaster recovery (DR) across on-prem and hybrid workloads. Manage incidents, service requests, and change controls via standard ITIL-based processes. Lead and participate in root cause analysis for infrastructure-related incidents and issues. Maintain and update detailed technical documentation and configuration records. Act as a senior point of contact for customers, attending …
San Diego, California, United States Hybrid / WFH Options
Gridiron IT Solutions
using Infrastructure as Code with Terraform and configuration management tools like Ansible. Automate repetitive tasks to eliminate toil and drive consistency and repeatability. Actively participate in incident response and root-cause analysis, and support a blameless post-mortem culture. Qualifications: Eligible for Top Secret/SCI Security clearance. 5+ Years Experience working in a Security culture. Experience working …
Work collaboratively with development, DevOps, and security teams to ensure data governance, compliance, and operational efficiency. Implement monitoring and alerting solutions using tools like CloudWatch, Datadog, or Prometheus. Conduct root cause analysis (RCA) and develop long-term preventive strategies. Maintain and enforce database standards, documentation, and operational procedures. Required Qualifications: 7+ years of experience in database engineering …
/experience with workstations, laptops, printers, smartphones, and tablets. Working knowledge/experience of PC imaging tools, diagnosis and remote-control tools, documentation, and ticketing. Excellent troubleshooting, problem-solving, and root cause analysis skills. Excellent customer service skills - Must be able to interact in person with customers who are experiencing network/technology related issues. Ability and willingness …
and introducing new tools or automations. Understanding of the ITIL framework or service management methodology. Experience in incident management, including owning the response process for urgent issues and ensuring root cause analysis is performed and documented. Excellent communication skills, both written and verbal. Hands-on experience with service desk tools, e.g. Jira, Zendesk, ServiceNow. If that's …
in alignment with ITSM principles to ensure consistent service quality. Incident & Change Management - Direct incident, problem, and change management activities in accordance with ITIL standards, ensuring rapid issue resolution, root-cause analysis, and long-term service stability. Technical Collaboration - Liaise closely with Salesforce and AWS specialists to coordinate upgrades, patch releases, and enhancements, ensuring minimal service disruption …
North West London, London, United Kingdom Hybrid / WFH Options
SEFE MARKETING & TRADING LIMITED
Oracle estate is secured, up-to-date with security patches, operating system updates, and aligned with company policies. Maintain proper database security and monitor compliance. You'll provide prompt, precise root cause analysis and work closely with IT Development and Infrastructure teams to resolve issues and improve performance. Disaster Recovery & Incident Management: Participate in disaster recovery exercises, ensure …
Bradford, Yorkshire and the Humber, United Kingdom
Alscient
in alignment with ITSM principles to ensure consistent service quality. Incident & Change Management - Direct incident, problem, and change management activities in accordance with ITIL standards, ensuring rapid issue resolution, root-cause analysis, and long-term service stability. Technical Collaboration - Liaise closely with Salesforce and AWS specialists to coordinate upgrades, patch releases, and enhancements, ensuring minimal service disruption …
adjustments. Coordinate with procurement agents, suppliers, and repair centers to ensure timely and quality repairs. Monitor inventory levels of repairable spares and manage the logistics of parts movement. Conduct root cause analysis to identify and address recurring issues with spare parts. Maintain detailed records of repair activities, costs, and inventory status. Ensure compliance with company policies and …
Cheltenham, Gloucestershire, South West, United Kingdom
LM RECRUITMENT SOLUTIONS LTD
Lead the adoption of proactive monitoring and automation tools to help transition the business from reactive support to predictive, streamlined operations. Lead on service management excellence: ticket discipline, root cause analysis, and continuous improvement. Ensure all backup strategies (on-premises and cloud) are fit for purpose, with robust monitoring and management to maintain data integrity and …
troubleshoot application functionality within VDI sessions in partnership with application owners. Create and manage desktop pools in Horizon Administrator/Console, including both persistent and non-persistent configurations. Perform root cause analysis for recurring issues and implement permanent fixes. Maintain user entitlement mappings and access to appropriate VDI pools. Monitor system health and generate reports on performance …
device, configuration, able to easily navigate the CLI, deploy applicable patches, and make configuration changes as needed. Have strong analytical and problem-solving skills. Candidates are expected to perform root cause analysis to troubleshoot and identify issues at all layers of the network. Expertise with WAN/Transport and IP routing technologies and protocols; candidates should have an …
Reliability: Establish deep observability into cloud network paths, health indicators, and latency measurements. Apply SRE practices to ensure uptime, fast incident response, and continuous improvement. Drive performance optimization and root cause analysis through telemetry, analytics, and runbooks. Define and monitor SLAs, SLOs, and KPIs related to cloud connectivity experience. Security, Compliance & Governance: Ensure secure design and enforcement …
greater value for money (VFM) and improved service delivery. Key Responsibilities: Discovery Phase: Review data service contracts and assess current SLAs for efficiency, effectiveness, and value for money. Behavioural Analysis: Engage with service providers and consumers to understand behaviours, motivations, and challenges. Conduct root cause analysis to identify areas for innovation in service use. Operating Model …
for its advertiser platform. This team sits at the intersection of Engineering, Product, and Operations, supporting tooling, workflows, and program execution. You will play a key role in data analysis, project execution, and cross-functional collaboration, helping to improve workflows, streamline tooling, and deliver insights that directly enhance operational efficiency. Experience Level: 2–5 years (up to a maximum … of 7 years). Please note: senior-level candidates will not be considered for this role. What You’ll Do: Conduct qualitative and quantitative analysis of projects post-launch, providing insights and identifying areas of improvement. Translate operational issues into data-driven recommendations. Partner with Systems Program Managers on tooling changes, root-cause analysis, and workflow audits … communicate updates to stakeholders. Assist in shaping the tech roadmap, ensuring alignment with team goals and priorities. What We’re Looking For: Top 3 Must-Haves: Strong SQL and data analysis skills. Project management and ability to drive execution. Excellent communication and stakeholder collaboration. Good to Have: Experience in operational or vendor-facing environments. Familiarity with process design tools (Lucid, Figma …
Sheffield, South Yorkshire, United Kingdom Hybrid / WFH Options
Experis
skills. Self-driven and able to work independently with minimal supervision. Requirements gathering in a technical environment - Cloud and on-premise infrastructure platforms. Strong process mapping skills. Strong analysis skills. Identification of use cases. Technical understanding of infrastructure technologies. Experience of working closely with IT developers/engineers and operational teams. Root cause analysis skills …
Handsworth, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
eTeam
skills. • Self-driven and able to work independently with minimal supervision • Requirements gathering in a technical environment – Cloud and on-premise infrastructure platforms • Strong process mapping skills • Strong analysis skills • Identification of use cases • Technical understanding of infrastructure technologies • Experience of working closely with IT developers/engineers and operational teams • Root cause analysis skills …
to the following protocols: OSPF, BGP, MPLS, VRFs, VPNs • Support technical exchange meetings to help resolve other ISP and application owner issues. • Support problem management to conduct root cause analysis of recurring issues. • Provide day-to-day Operations and Maintenance (O&M) support for incident management events and outages. • Collaborate with IT staff on projects … and initiatives. • Work with corresponding technical support teams as required to resolve network traffic concerns. • Utilize monitoring tools and log collectors to provide in-depth analysis of traffic anomalies and issues. • Candidate will be on call after hours on a rotational basis. • Not required but experience with Palo Alto, Cisco Adaptive Security Appliance (ASA), and Forcepoint firewall equipment and …
serving the US national security community and allies. Job Description As a Systems Engineer, you will support on-site national security mission systems including Design & Architecture, Integration & Testing, Data Analysis, and Capability Development. You will be directly involved in systems planning, installation, commissioning, site surveys, and troubleshooting for custom solutions serving the US national security community and allies. This … with software and operating systems independently. • Work with production environments spanning 100+ servers across multiple sites. • Manage and troubleshoot custom applications and services for data movement and delivery. • Perform root-cause analysis for complex production environment issues. • Support application servers, database servers, and interdependent worker services. • Create installation guides and as-built documentation. • Provide technical guidance on … Operating System, specifically CentOS/Red Hat 7 and 8. • Experience with KVM virtualization from day one. • 2+ years of direct experience in systems design & architecture, integration & testing, data analysis, and capability development. • Mid-scale deployment operations experience with 100+ servers across multiple sites. • Experience with custom applications and services deployment and management. • Demonstrated experience troubleshooting and solving problems …