Washington, Washington DC, United States Hybrid / WFH Options
General Dynamics Information Technology
Make an Impact: Analyzes customer requirements and provides highly innovative technical expertise on cloud computing techniques, technologies, infrastructure, DevOps and related cloud architecture Recognized Subject Matter Expert in systems analysis Maintains current knowledge of relevant technology as assigned May serve as a team or task lead Work with project managers and developers to plan and implement database projects Systems … roles/features, or backing up system configurations) Monitor system performance, CPU, memory, disk usage and event logs via Azure Monitor or custom scripts Respond to alerts and conduct root cause analysis for any downtime or performance degradation Maintain data backups and recovery procedures for critical systems Periodically test restore processes to ensure COOP and disaster recovery …
from requirements gathering to deployment Lead business and stakeholder teams to effectively translate business requirements into technical solutions keeping in mind best practices and industry standards Perform fit-gap analysis to identify opportunities to automate and make existing processes more efficient. Collaborate with various teams, including third party vendors, Enterprise Applications and Infrastructure teams on various projects and day … projects Build and foster client & peer relationships, partner with other teams to deliver mission critical applications Lead support teams and other team members to troubleshoot critical incidents by conducting root cause analysis and identifying solutions Contribute to impact analysis during various application Release Cycles Own comprehensive technical documentation of integrations and other applications for document versions …
efforts for complex software and hardware systems, ensuring the quality and reliability of our products. You will be involved in all stages of the software development lifecycle, from requirements analysis to deployment and maintenance. You will work closely with government engineers, managers, and other stakeholders to define test strategies, create test plans, develop and execute test cases, and analyze … and/or replacements. Interpret Computer Aided Design (CAD) models and drawings. Be familiar with the comprehensive design process including conceptual development, physical design, support to procurement, engineering test, root-cause analysis, and integration and support of product qualification Review and analyze test plans, test cases, and test results created by the contractor to ensure completeness, accuracy … engineering processes, gates, and reviews Knowledge of Air Force engineering systems Ability to analyze supportability of the design Experience in Risk Management Experience in Configuration Management Experience in Logistical Analysis Active DoD clearance - Secret or above Pay Information Full-Time Salary Range: $86460 - $146982 Please note: This range is based on our market pay structures. However, individual salaries are …
automated testing is an integral part of the IT delivery process Execute automated test scripts, analyse test results, and report defects, providing accurate and detailed information to aid in root cause analysis and issue resolution Actively support the control environment, maintaining control effectiveness across existing controls, being mindful of emergent risks across IT Delivery functions Ensure Testing …
SLOs, and error budgets for critical systems. -Monitor system performance, diagnose issues, and implement long-term fixes. Incident Response & Prevention -Coordinate high-impact incident response efforts and postmortems. -Drive root cause analysis and long-term improvements. Tooling & Automation -Build and enhance internal tooling to improve deployment, monitoring, and reliability. -Implement infrastructure as code and CI/CD …
Huntsville, Alabama, United States Hybrid / WFH Options
SAIC
proactively identify and resolve operational, tooling and process inefficiencies. On-Call after hours support may be required for critical systems. The candidate will collaborate with the customer to determine root cause analysis and corrective actions. Key Responsibilities: Lead Red Hat Enterprise Linux (RHEL) administration and provide principal level leadership. Support IaaS environments with RHEL systems engineering, administration …
checks to identify process defects Reporting Support the creation of routine reporting packs and dashboards for internal stakeholders, utilising and defining performance metrics - Service Level Agreements (SLAs) etc Conduct Analysis utilising tools such as Excel or PowerBI, to identify trends and opportunities for both system optimisation and improvement in operational performance Continuous Improvement - Operations process optimisation Proactively identify opportunities … generating and maintaining a knowledgeable Problem Solving Critically assess and collaboratively work alongside the function's operations team, managed service vendors and enterprise IT team to identify/support root cause analysis and remediation of issues, incidents and escalation. Bridge the gap by translating business requirements to the Tech team and vice versa Vendor Management Maintain a …
Accrington, England, United Kingdom Hybrid / WFH Options
World Options Ltd
governance across the UK operations and ensuring that every technology investment delivers tangible, measurable benefits that positively impact revenue, margin, and EBITDA. Key Responsibilities Requirements Management: Lead the collection, analysis, and prioritisation of functional and non-functional requirements across the three UK business units. Translate approved requirements into clear user stories, detailed acceptance criteria, and well-defined delivery plans … IT Manager. Establish and monitor effective Service Level Agreements (SLAs) and Operational Level Agreements (OLAs), curate a comprehensive knowledge base, measure user satisfaction (CSAT, NPS), and drive thorough incident root-cause analysis. Stakeholder Engagement & Communication: Act as a trusted advisor and key liaison for UK franchise partners, country management, and functional leads. Produce clear, data-driven status reports … UK IT Manager & Help Desk Team Development partners (internal & external) supporting UK systems UK Franchise partners & store owners Skills & Experience Proven track record of 7+ years in IT business analysis, product ownership, or IT governance roles, ideally within multi-site or franchise organisations operating in the UK. Demonstrable success in managing technology initiatives within complex, multi-platform environments (experience …
Engineer Documentation & Operational Support Document technical procedures, automation workflows, and deployment steps Maintain logs and status reports on routine tasks, transfers, and system health Participate in incident response, including root cause analysis and resolution tracking Required Skills, Education, and Experience: Bachelor's degree and 8+ years of experience OR Master's degree and 6+ years of experience Must …
storage, backups, and Linux systems using tools such as Ansible, Terraform, and GitHub. Collaborate with cross-functional teams to align infrastructure delivery with DevOps best practices. Lead incident response, root cause analysis, and ongoing support for critical infrastructure services. Define and implement infrastructure administration standards and procedures. Champion Infrastructure as Code and continuous improvement across the hosting …
London, South East, England, United Kingdom Hybrid / WFH Options
Tate Professional
storage, backups, and Linux systems using tools such as Ansible, Terraform, and GitHub. Collaborate with cross-functional teams to align infrastructure delivery with DevOps best practices. Lead incident response, root cause analysis, and ongoing support for critical infrastructure services. Define and implement infrastructure administration standards and procedures. Champion Infrastructure as Code and continuous improvement across the hosting …
Be proficient in Linux server and system administration (e.g., package management, kernel updates, filesystems, volume management) Have experience managing containerized workloads using Docker or Kubernetes Be an expert in Root Cause Analysis Have a strong desire to learn new skills and technologies, with proven research capabilities and adaptability Possess at least two years of experience training and …
links and/or collaborating with agency stakeholders. Assist in system design, development, and implementation. Install, configure and maintain hardware and software. Analyze and resolve issues, and determine/provide root cause analysis with details on resolution/restoration. Provide cybersecurity support and documentation for information/operational technology and/or telecommunications systems to obtain favorable assessments …
operational performance, and security compliance. Facilitate effective communication between IT teams and business units. Problem Solving and Incident Management: Manage and resolve high-priority incidents and critical issues. Conduct root cause analysis and implement corrective actions to prevent recurrence. Develop and maintain incident response plans and procedures. Requirements: Proven experience as a Digital Operations Manager, IT Manager …
and workflows. Develop and manage detailed program schedules and report weekly progress and risks to senior leadership teams across multiple sites and functions. Participate in team meetings for metric analysis and daily/weekly goals achievement. Contribute to the development of other team members. Confident working in a Linux and CLI environment, and working with pre-written network configurations. … OSI model - Previous experience working in a Data Center environment and understanding of Linux/Unix Administration - Experience in large scale DC or equivalently complex troubleshooting and maintenance practices - Root cause analysis and troubleshooting/problem solving experience - Ability to understand & communicate high level technical solutions - Data Center, System Engineering, or Networking engineering background and demonstrated excellence. …
VDI implementations in an enterprise setting. Proficiency in PowerShell scripting and automation workflows. Deep understanding of Azure networking, identity, and storage components. Strong troubleshooting skills and ability to lead root cause analysis efforts. Excellent communication, documentation, and stakeholder engagement skills. Bachelor's degree in Computer Science or related field, or equivalent experience. Preferred Qualifications Microsoft certifications such …
Waterwells Business Park, Quedgeley, Gloucester, Gloucestershire, England, United Kingdom Hybrid / WFH Options
IMT Resourcing Solutions
and infrastructure (RMS, mobile and CAD platforms). Key Responsibilities Validate, cleanse and enrich large, operational datasets; fix anomalies before they hit production. Profile data, uncover patterns and perform root-cause analysis using T-SQL and BI/visualisation tools. Own data-quality KPIs (completeness, accuracy, timeliness) and present clear insights to stakeholders. Maintain data dictionaries, quality …
stakeholders to define, implement, and support data flow and routing logic Translate business requirements into technical solutions with a strong focus on performance, reliability, and maintainability Perform troubleshooting and root cause analysis across complex systems in high-pressure trading environments Support deployment, monitoring, and validation of production systems in an Agile/DevOps workflow Gain a deep …
not limited to Cisco Routing, Switching, Security, SDN, Unified Communications and Wireless technologies). Identify and explore opportunities for enhancing efficiency, leveraging orchestration technologies to streamline and automate. Lead 'Root Cause Analysis' investigations into network faults, security and performance issues. Support the Principal NetOps Engineer and Architects with project implementation. Liaise with third party service providers for …
Salisbury, Wiltshire, United Kingdom Hybrid / WFH Options
Sopra Steria Group
not limited to Cisco Routing, Switching, Security, SDN, Unified Communications and Wireless technologies). Identify and explore opportunities for enhancing efficiency, leveraging orchestration technologies to streamline and automate. Lead 'Root Cause Analysis' investigations into network faults, security and performance issues. Support the Principal NetOps Engineer and Architects with project implementation. Liaise with third party service providers for …
small - solutions living within the Northrop Grumman Microelectronics Center (NGMC). Boasting state-of-the-art design capabilities, multiple processing nodes, electrical testing, environmental and QCI screening, and failure analysis, the NGMC is a leader in designing, fabricating, packaging, and delivering discriminating microelectronics to the military, aerospace, and commercial markets. For more than 70 years, we have been offering … test workflows across various environments Deploy and maintain robust monitoring, alerting, and observability tools (e.g. Prometheus, Grafana, ELK) to enhance performance, reliability, and visibility Automate incident management processes, including root cause analysis and self-healing mechanisms, to improve platform stability Ensure compliance with security best practices throughout the platform and processes Collaborate with development, Information Assurance, and …
leaders both within Leidos and CBP. Assist in change management of container platform for new versions of OpenShift, Hotfixes, SysAdmin tasks, etc. Works with experienced team members to conduct root cause analysis of issues, review new and existing code and/or perform unit testing. Authors/edits system documentation/playbook(s) Partners with experienced team …
and non-technical staff across numerous areas. * Proven ability to work independently on multiple tasks with commitment and willingness to see issues through to resolution * Excellent problem solving and Root Cause Analysis skills * Proficiency in understanding, analysing and defining corrective actions for any tickets raised by users * Understanding of virtualization and environments ability to understand Intune administration * Knowledge …
and build environments using Infrastructure as Code with Terraform and configuration management tools like Ansible. Automate repetitive tasks to eliminate toil and drive consistency and repeatability. Incident response and root-cause analysis; support a blameless post-mortem culture. Required Skills and Qualifications Active TS/SCI security clearance (TS/SCI with Poly is preferred). Bachelor …
Chantilly, Virginia, United States Hybrid / WFH Options
Edgesource
like Grafana and Prometheus. Ensure comprehensive monitoring, logging, and alerting for all services. Reliability and Performance: Ensure high availability and performance of services. Conduct capacity planning, performance tuning, and root cause analysis for incidents. Implement and maintain service level objectives (SLOs) and service level indicators (SLIs). Operational Excellence: Develop and enforce best practices for incident management …