Infrastructure Specialist - Linux, HPC, Infrastructure & Storage
Platform Infrastructure Specialist – Linux, HPC, Infrastructure & Storage
We are partnering with a leading organisation to hire skilled Infrastructure Platform Specialists to support their advanced IT infrastructure, HPC, and enterprise storage environments. This is an exciting opportunity for candidates with a strong technical background in server, storage, and high-performance computing environments.
Key Responsibilities:
- Install, configure, and maintain physical and virtual servers, including Dell and Nvidia platforms.
- Design, deploy, and manage enterprise storage (SAN/NAS) and backup environments, ensuring performance, scalability, and compliance.
- Support HPC and AI compute infrastructures, GPU clusters, and model training platforms.
- Monitor system health, performance metrics, and proactively resolve incidents to maintain uptime and meet SLA targets.
- Collaborate with cross-functional teams to deliver seamless operations and support strategic projects.
- Implement security best practices, patching, and disaster recovery processes.
- Document procedures, configurations, and incident resolutions for compliance and knowledge sharing.
- Lead troubleshooting efforts, root cause analysis, and operational improvements, providing guidance to junior engineers.
Required Skills & Experience:
- Strong technical expertise across server, storage, networking, and HPC infrastructure.
- Experience with Dell EMC, NetApp, HPE, Nvidia, and virtualization technologies.
- Linux and Windows system administration experience.
- Knowledge of backup, disaster recovery, and monitoring tools.
- Scripting and automation skills (Python, PowerShell, Bash, Ansible, Terraform).
- Experience with workload schedulers (e.g., SLURM, PBS) and AI/ML frameworks (TensorFlow, PyTorch) is a plus.
- Excellent analytical, troubleshooting, and collaboration skills.
Qualifications:
- Relevant certifications (ITIL, Dell EMC, NetApp, NVIDIA DLI, Veeam, or equivalent) are highly desirable.