Experience with Linux virtualization, networking or graphics stacks. Experience with one or more of the following technologies: confidential computing, RDMA, InfiniBand and high performance computing. Experience with Docker/OCI containers/K8s. Performance engineering, benchmarking and profiling. What We Offer You: We consider geographical location, experience, and performance in shaping compensation worldwide. We revisit compensation annually (and more ...
PyTorch, or Hugging Face Transformers. Good understanding of programming/scripting (e.g., Python, Go) for customizing solutions, creating scripts, or automating tasks. Experience with AI-relevant infrastructure, including networking (InfiniBand and RoCE), storage (FC, IP and scale-out) and AI accelerators (GPUs etc.). Excellent presentation skills: ability to value-sell and deliver engaging workshops to both technical and non ...
Staff Software Engineer, AI Reliability Engineering. London, UK. About Anthropic: Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our ...
CUTLASS, CUB, Thrust, cuDNN and cuBLAS. Intuition about the latency and throughput characteristics of CUDA graph launch, tensor core arithmetic, warp-level synchronization and asynchronous memory loads. Background in InfiniBand, RoCE, GPUDirect, PXN, rail optimisation and NVLink, and how to use these networking technologies to link up GPU clusters. An understanding of the collective algorithms supporting distributed GPU training in ...