GPU Chief Architect
We are seeking a highly experienced GPU architect to lead the definition and execution of next-generation mobile GPU architecture, while driving architectural convergence between GPU and NPU toward a coherent xPU sub-system design.
This role requires deep expertise in GPU microarchitecture, strong system-level architectural capability spanning both hardware and software, and a thorough understanding of common graphics and AI workloads. A proven track record of delivering related sub-system IP or complex SoC silicon is highly desirable.
The successful candidate will lead the effort to shape a converged xPU architecture native to future AI compute, optimised for performance, power efficiency, and silicon area in next-generation mobile compute platforms.
Key Responsibilities:
xPU Converged Architecture Design
- Working from first principles, analyse and characterise future mobile graphics and AI workloads, and define from the ground up a converged xPU (GPU and NPU) architecture, spanning hardware and software, that is optimal for future applications.
- Ensure compatibility with, or an easy transition from, the existing architecture.
- Define unified or partially unified execution resources (vector, scalar, tensor units)
- Develop shared scheduling and workload dispatch mechanisms for graphics and AI
- Design resource sharing and isolation strategies under mixed workloads
- Evaluate architectural trade-offs between dedicated and converged compute blocks
Mobile GPU Architecture Leadership
- Ensure timely delivery of the next-generation mobile GPU architecture and its long-term roadmap
- Lead evolution of shader cores, execution pipelines, and cache hierarchy
- Drive performance, power efficiency (Perf/W), and area efficiency (Perf/mm2)
- Provide architectural leadership from concept phase through tape-out
Memory & Interconnect Architecture
- Define a memory hierarchy strategy for converged GPU/NPU workloads
- Architect shared cache structures and bandwidth arbitration policies
- Optimise on-chip interconnect for heterogeneous compute traffic
- Reduce data movement overhead across compute domains
System-Level Architecture Collaboration
- Collaborate with CPU, AI software, runtime, and system architecture teams
- Participate in SoC-level power, thermal, and floorplanning trade-offs
- Align hardware architecture with graphics APIs and AI frameworks
- Support performance modelling, workload characterisation, and silicon bring-up
Required:
- 15+ years of experience in GPU, AI accelerator, or heterogeneous compute architecture
- Deep understanding of GPU microarchitecture (SIMD/SIMT, scheduling, memory systems)
- Strong knowledge of tensor/matrix computation and AI acceleration techniques
- Proven experience delivering high-volume silicon
- Expertise in performance modelling and power analysis
- Strong cross-functional communication and leadership capability