configuration tools such as Terraform, Puppet, and PowerShell. Knowledge of identity/access management solutions and best practices for authentication and authorization. Familiarity with infrastructure-based processes like monitoring, capacity planning, facilities management, performance tuning, asset management, disaster recovery, and data center support. Ability to quickly learn new technologies using documentation and online resources. Experience with AKS/
/800-53, RMF, and DFARS. • Conduct compliance audits and risk assessments, and assist in developing System Security Plans (SSPs) and POA&Ms. • Monitor system performance, perform capacity planning, and implement backup/recovery processes. • Provide Tier 1 support and serve as an escalation point for complex technical issues unresolved by helpdesk staff. • Participate in security configuration baselines
IMMIGRATION PETITIONS AT THIS TIME. IT NETWORK SYSTEMS ADMINISTRATOR RESPONSIBILITIES: Network: Plan, design, and implement the organization's network infrastructure, including but not limited to LAN, WLAN, and WAN. Capacity planning to ensure system/network scalability and reliability. Support, configure, maintain, and upgrade servers, shared drives, etc. Evaluate and recommend hardware and software solutions to meet the
with the goal of automating response to all non-exceptional service conditions. Influence and create new designs, architectures, standards, and methods for large-scale distributed systems. Engage in service capacity planning, service integration and geo-expansion, software performance analysis and system tuning. Candidate must be solutions-oriented, using rigorous logic and methods to solve difficult problems with effective
London, England, United Kingdom Hybrid / WFH Options
Warner Bros. Discovery
with the goal of automating response to all non-exceptional service conditions. Influence and create new designs, architectures, standards, and methods for large-scale distributed systems. Engage in service capacity planning, service integration and geo-expansion, software performance analysis and system tuning. Candidate must be solutions-oriented, using rigorous logic and methods to solve difficult problems with
with LAN and WAN deployments. Demonstrated experience with Enterprise Backup Solutions such as Microsoft DPM and ArcServe Backup Suite. Demonstrated experience with Storage Area Networks (SAN), including design, provisioning, capacity planning, and maintenance. Experience with Hyper-V virtual machines. Demonstrated experience with network design (LAN & WAN), including backup systems and disaster recovery architectures (DR). Demonstrated experience evaluating
You excel at troubleshooting under pressure and can systematically get to the root of complex problems. Equally, you’re proactive – you anticipate future needs and improvements, whether it’s capacity planning for growth or shoring up a security gap, and take initiative to address these. Highly motivated and possessing an immense sense of pride in your work; you
business goals and resource capacity. Operational Excellence: Champion initiatives that enhance system availability, scalability, and performance. Collaborate with the Global SRE Leader to refine and enforce operational policies (e.g., Capacity Planning, Change Management, Disaster Recovery). Cross-Functional Collaboration: Partner with Software Engineering, Infrastructure, Operations, Security, and Business teams to deliver secure and reliable platforms. Team Development: Build … and compliance reporting. Automation & Efficiency: Lead automation initiatives to streamline workflows and increase uptime. Use Jira to manage tasks and projects, and align global SRE practices for seamless support. Capacity Planning: Drive timely capacity planning to prevent last-minute issues. Support budget planning to align infrastructure investments with growth and performance targets. Participate in quarterly … capacity reviews and follow up on outcomes. Monitoring & Analytics: Oversee the implementation of monitoring and alerting systems to detect and resolve issues proactively, before customer or compliance impacts occur. Qualifications: Bachelor’s degree in Computer Science, Engineering, or related field (Master’s). 7+ years in a technical SRE or DevOps position. 2+ years in a leadership or senior engineering capacity
lead major incident reviews, and drive systematic improvements Automation Development: Spearhead automation initiatives to reduce manual operations and improve system reliability Performance Optimization: Lead projects to optimize system performance, capacity planning, and cost efficiency Cross-team Collaboration: Work closely with development teams to implement SRE best practices and drive operational excellence Technical Strategy: Develop and execute technical roadmaps … with expertise in incident management and post-mortem processes Team Development: Experience in hiring, mentoring, and growing high-performing technical teams while fostering a culture of continuous learning Strategic Planning: Ability to develop and execute technical roadmaps aligned with business objectives and scalability requirements Problem-Solving Skills: Track record of solving complex technical challenges and implementing sustainable solutions Communication … in communicating technical concepts to both technical and non-technical stakeholders Automation Expertise: Strong background in infrastructure automation, CI/CD pipelines, and DevOps practices Risk Management: Experience in capacity planning, disaster recovery, and building resilient systems Cross-functional Collaboration: Proven ability to work effectively with product, development, and business teams Change Management: Experience in managing organizational change
and timely data. A strong candidate will have deep hands-on experience with Power BI, including Power Query, DAX, tabular modelling, and efficient data transformations. They will be experienced at Power BI capacity planning, optimizing performance, and integrating Power BI solutions with broader data platforms. The Senior Reporting Developer will create reports that reflect business needs and will … reliable data pipelines feeding into Power BI from multiple data sources. Implement data integration best practices to ensure consistent, accurate, and secure data across the reporting ecosystem. Infrastructure, Automation & Capacity Planning: Implement and manage Power BI Workspaces, gateways, and deployment pipelines. Apply CI/CD practices for reporting artifacts (e.g., version control, automated testing, and release management). … Identify and implement automation opportunities for processes such as refresh schedules, performance monitoring, and usage analytics. Contribute to capacity planning and management, ensuring optimal resource utilization and ability to handle growth in data volume and user concurrency. Collaboration, Engagement & Design Alignment: Work closely with business stakeholders, product managers, and data engineering/platform teams to gather reporting requirements
infrastructure, including proactive monitoring and maintenance Work with InfoSec and the Security Engineers to ensure consistent and robust network and firewall security Monitor network performance, identify trends, and provide capacity planning and resource management insights Provide network expertise to internal and external projects, engaging with stakeholders as required Assist in troubleshooting and resolving network issues, implementing preventative measures … staff in wider governance groups Ability to make technical decisions by managing levels of risk and complexity, recommending decisions as and when risk and complexity change or increase Strong planning and organisational skills, including the ability to coordinate several work streams simultaneously, while balancing priorities and quality Excellent communication skills with a capacity to present, discuss and explain
protected and only accessible by the engineer with the required skillset. Identifying and mitigating network vulnerabilities. Ensure security patches/firmware are tested and applied to maintain system security. Capacity Planning and Optimisation - Assessing network capacity and planning for future growth. Optimising network performance by analysing traffic patterns and making all necessary adjustments. Implementing network load
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
In Technology Group Limited
service availability Act on infrastructure alerts and monitoring tools to resolve issues efficiently Deliver enhancements to IT services via BAU, project workstreams, and internal initiatives Maintain and forecast infrastructure capacity and performance Perform regular housekeeping, patching, and system upgrades Core Technical Requirements (Essential): Strong experience supporting and maintaining Red Hat Linux (RHEL) environments Proven ability to perform in-place … Amazon Linux Imaging (AMI) Exposure to Windows Server environments (2016, 2019, 2022), including Active Directory, Group Policy, DNS, DHCP Experience managing VMware datastores and LUNs, including performance tuning and capacity planning Knowledge of vRealize Operations, Log Insight, and Network Insight tools Interested? If you're a Senior Infrastructure Engineer with hands-on Red Hat and VMware experience - and
London, South East, England, United Kingdom Hybrid / WFH Options
Lane Clark and Peacock LLP
infrastructure, including proactive monitoring and maintenance Work with InfoSec and the Security Engineers to ensure consistent and robust network and firewall security Monitor network performance, identify trends, and provide capacity planning and resource management insights Provide network expertise to internal and external projects, engaging with stakeholders as required Assist in troubleshooting and resolving network issues, implementing preventative measures … staff in wider governance groups Ability to make technical decisions by managing levels of risk and complexity, recommending decisions as and when risk and complexity change or increase Strong planning and organisational skills, including the ability to coordinate several work streams simultaneously, while balancing priorities and quality Excellent communication skills with a capacity to present, discuss and explain
and develop procedures and working practices for the efficient and effective running of all tasks associated with operating and controlling the Cloud infrastructure. Ensure that robust availability monitoring and capacity planning procedures are in place to ensure the resilience of operational services. Work with the wider IT team and managed service provider on the delivery of IT projects … review/approve CRs as required. Act as escalation point for IT operations issues and major incidents. Ensure incidents and service requests are managed to resolution. Assist in the planning and implementation of the infrastructure architecture including design, migration, integration and installation. Disaster Recovery Act as primary point of contact and escalation for DR activities including out of hours … DR testing. Maintain the global IT Service Continuity and Disaster Recovery (ITSC/DR) framework, including governance, plans, policies, processes, procedures, standards and strategies. Work with MSP and oversee planning and testing of IT DR/backup plans and co-ordinate testing activities with business users. Validate DR testing activities and results, sign off and communicate to senior business
IT systems and infrastructures are reliable, scalable, and secure. Key Responsibilities Leadership Environment Management: Deployment & Automation: Performance & Scalability: Security & Compliance: Collaboration & Stakeholder Management: Documentation & Reporting: Incident Management & Problem Resolution: Capacity Planning: Escalate issues as appropriate. Manage assigned risks and issues. Adhere to change, project, and analysis standards Skills, Knowledge & Abilities Experience: At least 5-7 years of experience
maintain file system security, access controls, and auditing. Collaborate with network, security, and application teams to ensure seamless integration. Experience with and knowledge of Windows file shares and managed folders. Perform capacity planning, performance tuning, and disaster recovery testing. Document system configurations, procedures, and best practices. Stay updated with emerging technologies and recommend improvements. Required Qualifications: Bachelor's degree in
Fairfax, Virginia, United States Hybrid / WFH Options
M.C. Dean
monitoring, and optimization of multiple, geographically separated data centers and the installed assets, power, cooling, and space utilization. This role requires a strong understanding of DCIM tools, data analytics, capacity planning, and operational best practices to maximize uptime, improve efficiency, and support business continuity. SQL: Working knowledge of Structured Query Language. Modify existing SQL reports, create new SQL … validate all information is accurate and consistent. Floor Plan and Location Management: Integrate and maintain data center master floor plans for each covered data center. Research into existing floor planning documentation. Coordination with facility floor managers regarding existing zoning. Review of DISA mechanical & electrical projects to determine planned/assumed zoning of infrastructure support equipment. Coordination with site TIM … Institute Tier classifications. Excellent communication skills with the ability to collaborate across multidisciplinary teams. Understanding of government and DoD security policies related to user account management. Experience with floor planning and asset modeling in data center environments. Project management experience, with the ability to lead DCIM-related initiatives and process improvements. Strong understanding of compliance and regulatory requirements related
on-call on a rotation for call-in support and after-hours support for system upgrades. Duties: • Provides technical expertise in the areas of system design, installation, configuration, tuning, capacity planning, troubleshooting, and problem resolution • Responds to Incidents and Changes submitted by customers, to include Tier 2 level triage • Responds to alerts and customer reported issues, escalates problems
activities. Conduct regular Disaster Recovery testing, validating Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO), and ensuring up-to-date DR procedures. Perform system maintenance, upgrades, performance monitoring, capacity planning, and compute/storage management with minimal disruption. Evaluate and implement patches, service packs, and upgrades, ensuring all dependencies and approvals are addressed. Maintain comprehensive documentation and