look at all the evidence available and support the client on the appropriate action to contain and remediate any security incident. They will need to be able to provide root cause analysis and liaise with the customer and the Service Delivery Manager, as well as ensuring the actions of the SOC Analysts follow best practice. Security Monitoring … Monitoring SIEM tools to assure a high level of security operations delivery. Oversee and enhance security monitoring systems to detect and analyse potential security incidents. Conduct real-time analysis of security events and incidents and escalate as necessary. Support other teams on investigations into incidents, determining the root cause and impact. Document findings and lessons learned … with the Technical Teams to ensure all new and changed services are monitored accordingly. Documentation: Maintain accurate and up-to-date documentation of security procedures, incident response plans, and analysis reports. Create post-incident reports for management and stakeholders. Support the creation of monthly reporting packs as per contractual requirements. Create and document robust event and incident management processes
look at all the evidence available and support the client on the appropriate action to contain and remediate any security incident. They will need to be able to provide root cause analysis and liaise with the customer and the Service Delivery Manager, as well as ensuring the actions of the SOC Analysts follow best practice. Job Duties … Monitoring SIEM tools to assure a high level of security operations delivery. Oversee and enhance security monitoring systems to detect and analyse potential security incidents. Conduct real-time analysis of security events and incidents and escalate as necessary. Support other teams on investigations into incidents, determining the root cause and impact. Document findings and lessons learned … with the Technical Teams to ensure all new and changed services are monitored accordingly. Documentation: Maintain accurate and up-to-date documentation of security procedures, incident response plans, and analysis reports. Create post-incident reports for management and stakeholders. Support the creation of monthly reporting packs as per contractual requirements. Create and document robust event and incident management processes
access control (RBAC), and ensuring compliance with DoD standards. Assist in the automation of operational tasks using Infrastructure-as-Code tools like Terraform or Bicep. Participate in incident response, root cause analysis, and post-incident reviews to improve system reliability. Provide helpdesk support by taking ownership of tickets in the Remedy ticketing solution, resolving issues, and managing
Falls Church, Virginia, United States Hybrid / WFH Options
Epsilon Inc
teams to optimize data pipelines for AI/ML initiatives, automation, and productization. Lead efforts to integrate security best practices, ensuring compliance with relevant regulations and standards. Conduct performance analysis, capacity planning, and system tuning to maximize uptime and reliability. Guide junior team members in troubleshooting techniques, documentation, and adherence to best practices. Drive continuous improvement by reviewing existing … for secure system architecture. Familiarity with data engineering concepts, including ETL/ELT pipelines, big data tools, and AI/ML workflows. Ability to troubleshoot complex system issues, perform root-cause analysis, and implement effective solutions. Excellent communication, teamwork, and organizational skills, with a focus on innovation and continuous improvement. One or more of the following certifications
application support strategies. Key Responsibilities: Own Application Support Lifecycle: Ensure end-to-end support for critical business applications, meeting SLAs and availability targets. Incident & Problem Management: Lead resolution and root cause analysis for all Retail application incidents, including major (P1/P2) issues. Escalation & Crisis Leadership: Act as the escalation point for major incidents and provide direction … containerization experience with Azure, Docker, and AKS. Familiarity with modern web technologies, including React, REST APIs, and SOAP architectures. Skilled in managing P1/P2 incidents, business impact analysis, root cause investigations, and change coordination. Strong grasp of IT service management practices; ITIL v4 certification or equivalent preferred. Proactive Monitoring: Hands-on experience with tools like
to the overall success of the FX desk's technology platform. * Respond rapidly to production incidents using data-driven decision making to minimise downtime and financial impact while leading root cause analysis and conducting blameless post-mortems. * Enhance application health monitoring by implementing robust observability solutions and automating manual processes to improve system resilience. * Drive cost optimisation
Security Analyst, you will monitor and analyze security alerts and events using SIEM and other security tools to identify and respond to threats. Lead investigations of security incidents, perform root cause analysis, and coordinate response and remediation efforts. Collaborate with IT and business units to implement and improve security controls and ensure compliance with internal policies and … company match, Tuition Reimbursement, and Mileage Reimbursement. Annual bonus based on performance and eligibility. Requirements: Bachelor's degree in a related field (e.g., Computer Science, Computer Engineering, Information Technology, System Analysis, etc.) or equivalent combination of education and work experience. Typically, 3+ years of experience in IT/network security/cybersecurity. Strong understanding of security principles, threat landscapes, and
Proactively identify areas for improvement and implement preventive measures. Service Improvement: Continuously assess the IT service delivery process and implement improvements that enhance efficiency, effectiveness, and customer satisfaction. Lead root cause analysis for service delivery issues and define corrective actions. Change Management: Ensure that changes to the IT environment are implemented smoothly with minimal disruption to service.
for improvement and minimizing wastage. Encouraging and building automated processes wherever possible. Identifying and deploying security measures by continuously performing vulnerability assessment and risk management. Incident management and root cause analysis. Coordination and communication with the team and with customers, both external and internal. Selecting and deploying appropriate CI/CD tools. Managing periodic reporting on the
Falls Church, Virginia, United States Hybrid / WFH Options
Epsilon Inc
of data between systems by helping with Extract, Transform, Load (ETL) processes and ensuring data consistency across different platforms. Monitor and Troubleshoot Database Performance Issues - Identify potential bottlenecks, perform root cause analysis, and work with senior architects to implement solutions that enhance database reliability and efficiency. Support Compliance and Regulatory Requirements - Ensure database structures and data management
Falls Church, Virginia, United States Hybrid / WFH Options
Epsilon Inc
assessments and provide actionable recommendations for mitigation. Experience supporting security for data pipelines, AI/ML environments, or cloud-based infrastructures. Excellent incident response skills, including triage, containment, and root cause analysis. Strong communication and collaboration abilities to partner with cross-functional teams and stakeholders. One or more of the following certifications are desired: Certified Cloud Security Professional
for improvement and minimizing wastage • Encouraging and building automated processes wherever possible • Identifying and deploying security measures by continuously performing vulnerability assessment and risk management • Incident management and root cause analysis • Coordination and communication with the team and with customers, both external and internal • Selecting and deploying appropriate CI/CD tools • Managing periodic reporting on the
best practices, cloud strategies, and platform engineering. Team Leadership: Guide and coach a team of engineers, technical specialists, and architects, encouraging the adoption of innovative technologies and practices. Technical Analysis: Lead technical analysis and estimation efforts for custom-built applications. Best Practices: Drive the adoption of release management and automation best practices. Incident Management: Ensure thorough root cause analysis and prompt remediation during any incidents or outages. Vendor Coordination: Work with external vendors to supplement team capacity and expertise when necessary. YOU'RE GOOD AT You bring solid development and program leadership experience to drive technical governance, innovation, integrations, and cloud strategies using emerging technologies like Gen AI. You thrive in environments that demand
Unix engineers and administrators in managing enterprise Unix systems, ensuring high availability, performance, and security. Oversee the configuration, deployment, and lifecycle management of Unix-based systems. Manage incident response, root cause analysis, and resolution for Unix-related issues. Develop and manage budgets, forecasts, and vendor relationships related to Unix infrastructure, including license/subscription management of all
scalable, resilient platforms that support long-term growth. • Capacity Planning & Service Quality: Own service performance metrics and embed proactive capacity planning across infrastructure and services. • Proactive Issue Resolution: Lead root-cause analysis, implement preventive controls, and champion continuous service improvement. • Service Management Governance: Oversee ITIL processes and support internal audits with robust systems and policies. • Incident & Change
deployment, monitoring, and scaling. • Continuously evaluate and improve the cloud infrastructure to align with evolving technology trends and business requirements. • Respond to and resolve cloud-related incidents, providing detailed root cause analysis and long-term solutions. • Work with other teams to ensure robust disaster recovery and business continuity planning. • Stay current with emerging cloud technologies and propose
base articles. Monitor application health using tools and custom dashboards. Support integration and communication between cloud platforms (Azure, Entra ID, Microsoft 365). Contribute to service improvement initiatives, including root cause analysis and automation opportunities. Participate in on-call rotations or after-hours incidents during peak retail periods. Work within established security frameworks and governance. Hybrid working
with innovative approaches, and proactively identify opportunities for process and system improvements. Keep abreast of emerging technologies and industry trends. Oversee change management and incident response activities, including performing root-cause analysis investigations and bug fixes as required. Lead and mentor team members by providing coaching, training, performance evaluations, and fostering a culture of accountability, responsibility
business - Build datasets, metrics, and KPIs supporting business - Design and develop highly available dashboards and metrics using SQL and Excel/Quicksight or other BI reporting tools - Perform business analysis and data queries using scripting languages like R, Python, etc. - Design, implement and support end-to-end analytical solutions that are highly available, reliable, secure, and scale economically - Collaborate … cross-functionally to recognize and help adopt best practices in reporting and analysis, data integrity, test design, analysis, validation, and documentation - Proactively identify problems and opportunities and perform root cause analysis/diagnosis leading to significant business impact - Work closely with internal stakeholders such as Operations, Program Managers, Workforce, Capacity planning, machine learning, finance teams … Excel - 5+ years using data visualization tools like Tableau, Quicksight or similar tools - Experience with R, Python or other statistical/machine learning tools - Experience demonstrating problem solving and root cause analysis - Experience using databases with a large-scale data set - Bachelor's degree in engineering, analytics, mathematics, statistics or a related technical or quantitative field - Detail
cloud and hybrid environments. Architect observability solutions (monitoring, logging, alerting) that detect and prevent failures before they impact users. Own and improve incident response workflows, including runbooks, communications, and root cause analysis. Define and enforce SLIs, SLOs, and error budgets to balance innovation with operational stability. Mentor engineers and advise teams on best practices for scalability, security, deployment … efforts, reliability reviews, and cross-functional reliability programs. Core Responsibilities Operations Leadership Act as a senior escalation point for major incidents and production outages. Lead post-incident reviews, coordinate root cause analysis, and drive remediation plans. Communicate platform health, risk, and improvement plans with technical and non-technical stakeholders. Design and build robust CI/CD workflows
Job Summary: As a Security Analyst, you will provide day-to-day security monitoring, incident response, and threat analysis leveraging Splunk Enterprise Security (ES) and SOAR platforms. You will also play an active role in the ongoing buildout, configuration, and engineering of our Splunk ES environment, including onboarding new data sources, creating detection content, and developing automated response workflows. … fast-paced government setting. Key Responsibilities: • Monitor and analyze security events using Splunk Enterprise Security (ES) dashboards, alerts, and correlation searches. • Investigate and respond to security incidents, including triage, root cause analysis, containment, and remediation support. • Develop and fine-tune correlation rules, alerts, and dashboards in Splunk ES to improve threat detection capabilities. • Design, build, and maintain … onboarding new data sources, tuning correlation rules, and developing new detection use cases. • Collaborate with other teams to support incident response, vulnerability management, and threat hunting activities. • Conduct threat analysis, log analysis, and data enrichment using Splunk and other security tools. • Participate in regular security reviews and audits, providing evidence and reporting as needed. • Contribute to documentation and
Leeds, Yorkshire, United Kingdom Hybrid / WFH Options
BAE Systems (New)
hybrid and flexible working arrangements available. Please consult your recruiter for details. Grade: GG10 - GG11 Referral Bonus: £5,000 Job Description Serve as the point of escalation for intrusion analysis, forensics, and incident response queries. Provide root cause analysis for complex, non-standard findings and anomalies without existing playbooks. Mentor team members and share knowledge proactively. … red team and pentest findings to improve detection rules. Provide forensic support and threat emulation to improve alert triage and accuracy. Identify gaps in SOC processes, data collection, and analysis, demonstrating the need for improvements through scenarios and red teaming. Perform complex threat hunting, automation, and analytic enrichment tasks. Set vision and milestones for emulation and detection capabilities, influencing
testing. Work closely with development teams to integrate testing into the software development lifecycle (SDLC). Identify, document, and track defects using issue-tracking tools such as JIRA. Conduct root cause analysis and provide insights to improve product quality. Collaborate with cross-functional teams to ensure adherence to quality standards and best practices. Mentor and guide junior