software development and systems engineering. A high bar for code and configuration quality and readability. A good understanding of current observability and reliability practices. Experienced and comfortable in running incident response. Big-picture thinking: you can make trade-offs on technical work streams against business impact. Fantastic communication skills. You're able to articulate what you're working on ...
london (city of london), south east england, united kingdom
Duffel
Focus: Manage Nvidia GPU clusters and related infrastructure. Implement failover, resilience, and resource optimization strategies. Oversee capacity planning and workload scheduling. Monitor performance using Nvidia and HPE tools. Manage incident response, node failures, and access/security controls. Required Skills & Experience: Strong understanding of L1/L2 processes and troubleshooting workflows. Experience with cloud, APIs, and distributed systems ...
monitoring the platform security and integrating security tools into the S-SDLC. Work with the local DevSecOps team to improve our S-SDLC and take part in our security incident response team. Your Experience & Skills: At least 3 years of experience in software engineering. At least 2 years of experience in application security. In-depth knowledge of application ...