with ML engineers to implement GPU-level optimizations for ML model training and inference, focusing on speed and efficiency. Profile and optimize ML workloads running on GPUs, focusing on memory management, parallelization, and performance tuning. Develop and optimize custom GPU drivers and frameworks for ML-specific tasks, including model training, AI inference, and data preprocessing. Collaborate with data … to date with the latest GPU architecture and machine learning advancements, applying new techniques to optimize system performance. Skills and Experience: Proficiency in C++ with a strong focus on memory management, multi-threading, and low-level performance optimizations. Experience with GPU architectures (e.g., NVIDIA, AMD) and programming frameworks like CUDA, OpenCL, and TensorFlow. Understanding of machine learning algorithms …
City of London, London, United Kingdom Hybrid / WFH Options
Techfellow Limited
You Bring... 4-8 years' experience managing large-scale Linux infrastructure in high-performance, distributed, or AI-centric environments Deep technical fluency with GPU architecture, deployment, and tuning (e.g. memory management, driver compatibility, hardware diagnostics) Strong scripting and automation skills, especially in Python, with infrastructure-as-code mindset Hands-on experience resolving GPU workload issues across compute clusters … and supporting technologies Familiarity with performance tooling and debugging in live production environments Practical experience with CUDA or systems-level programming in C/C++ Experience with config management frameworks like Salt, Ansible, or Puppet (Preferred) Experience with GPU communication and interconnect technologies (e.g. collective communication libraries such as NCCL, low-latency solutions like GPUDirect RDMA, or high-speed …
South East London, England, United Kingdom Hybrid / WFH Options
Techfellow Limited
You Bring... 4-8 years' experience managing large-scale Linux infrastructure in high-performance, distributed, or AI-centric environments Deep technical fluency with GPU architecture, deployment, and tuning (e.g. memory management, driver compatibility, hardware diagnostics) Strong scripting and automation skills, especially in Python, with infrastructure-as-code mindset Hands-on experience resolving GPU workload issues across compute clusters … and supporting technologies Familiarity with performance tooling and debugging in live production environments Practical experience with CUDA or systems-level programming in C/C++ Experience with config management frameworks like Salt, Ansible, or Puppet (Preferred) Experience with GPU communication and interconnect technologies (e.g. collective communication libraries such as NCCL, low-latency solutions like GPUDirect RDMA, or high-speed …
/O, PCIe/DMA interactions, and high-speed protocols (e.g., SFPDP). Proficiency in C# for tooling or Windows-based test interfaces. Deep understanding of software design principles, memory management, and debugging hardware-software interactions. Desirable Skills: Experience with SFPDP in defence, aerospace, or data acquisition projects. Familiarity with FPGA-based data systems and hardware-in-the …
to think through client needs and incorporate end-user feedback Strong UI/UX instincts and an eye for visual design Understanding of client-side performance, including rendering optimizations, memory management, and state management Positive attitude, sense of humor and creativity Strong analytical, project leadership and communication skills Team leadership and management skills You should have … a strong interest in web-based software development; additional experience in the financial services technology/asset management space would be a bonus. The process: Meet with our CTO for a quick discussion to hear about you and talk about our story and where we're heading Complete a coding test and discuss it with a member …
you by Jobs/Redefined, the UK's leading over-50s age-inclusive jobs board. Company: Qualcomm Technologies International Ltd Job Area: Engineering Services Group, Engineering Services Group > Program Management General Summary: This Software Program Manager role focuses on Server Software teams, managing the planning, development, and delivery of software for Qualcomm's Server Business Unit. You will develop … business outcomes and benefits tracking. Collaborate with key stakeholders and program sponsors to develop program goals, set the prioritization of deliverables, discuss involvement of business processes (e.g., program change management, communication), and drive decisions necessary for on-time delivery. Manage program health and execution: Strong technical understanding of SW deliverables and risk management/risk mitigation. Establish rigorous … execution discipline & communication process: risk management, mitigation, tracking, schedule trends vs baseline, recovery actions, executive reporting & stakeholder communication. Develop program indicators to manage program health including quality and timelines. Promote program vision and objectives within the team, ensure program objectives are met or exceeded, present program vision to management, and gain buy-in from stakeholders. Additional responsibilities: Manage …
City of London, London, United Kingdom Hybrid / WFH Options
Annapurna
real-time decision-making in autonomous driving. What to Expect The successful candidate will focus on host-side software and hardware interactions to ensure optimal data transfer and resource management for efficient AI inference on GPUs. Key responsibilities include Developing and optimizing C++ code for efficient data transfer and latency management between the host and GPUs across diverse … vendor platforms. Working with low-level system and memory management techniques to minimize latency and improve real-time inference performance. Utilizing and implementing GPU programming APIs (e.g., CUDA, OpenCL) to ensure high efficiency and compatibility across GPUs. Profiling and debugging system performance using tools like NVIDIA Nsight, Intel VTune, and vendor-specific profilers, identifying bottlenecks and implementing effective … modern C++ standards. Proven experience in GPU programming and optimization, with proficiency in CUDA, OpenCL, or other GPU programming frameworks. Strong knowledge of parallel computing concepts, including data locality, memory access patterns, and synchronization. Proficiency with performance profiling tools and techniques for identifying and resolving system bottlenecks. Experience in system-level programming, including memory management, data alignment …
end performance challenges in AI inference stacks. Tech Stack & Focus Rust-first development (performance-critical systems, type-safety, modern tooling). Low-level systems programming (CPU/accelerator interaction, memory management). Compiler design, functional programming concepts, and hardware/software co-design. We’re Excited If You Have 3+ years in systems programming, compiler development, or performance …
secure, clean coding practices. Investigate and resolve production issues efficiently. Required Skills & Experience 5+ years in professional software development. Expert in C++11 or later, with a strong grasp of memory management, OOP, and concurrency. Skilled in AngularJS, HTML/CSS, and JavaScript. Experience with REST APIs, JSON, and version control (Git). Solid understanding of Agile frameworks …
Strong understanding of embedded hardware & driver concepts. Strong understanding of software and computer architecture concepts. Strong understanding of operating system concepts such as tasks, signals, timers, priorities, deadlocks, stacks, memory management, etc. Experience with JTAG-enabled devices and software debugger, with excellent debugging skills. Desirable: Working knowledge of cryptography and secure protocols. Qualcomm MSM and AMSS development experience. …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
driver components to deliver them for most recent Linux kernels and yet-to-be-published Android versions. This involves developing performance-critical drivers for GPU hardware, including scheduling and memory management for Linux and Android OSs. You will provide the foundations that will make the Mali GPU implementation of Vulkan, OpenGL and OpenCL simply the best in the …
various environments, including resource-limited devices and complex multi-modal systems. Your responsibilities include designing robust inference pipelines, establishing performance metrics, and troubleshooting bottlenecks to achieve low-latency, low-memory AI performance in real-world applications. Responsibilities: Design and deploy efficient model serving architectures optimized for diverse environments, including resource-constrained devices. Set and monitor performance targets such as … latency, throughput, and memory usage. Conduct inference testing in simulated and live environments, tracking key performance indicators and documenting results. Prepare high-quality datasets and scenarios for real-world deployment testing, focusing on low-resource devices. Analyze pipeline efficiency, diagnose bottlenecks, and optimize for scalability and reliability. Collaborate with cross-functional teams to integrate optimized frameworks into production, ensuring … related field; PhD preferred, with a strong publication record in AI R&D. Proven experience in kernel and inference optimization on mobile devices, with measurable improvements in latency and memory footprint. Deep understanding of model serving architectures, low-latency techniques, and memory management in resource-constrained environments. Expertise in CPU/GPU kernel development for mobile platforms …
materials, and knowledge sharing. Technology Environment STM32 Microcontrollers Zephyr RTOS with C++ abstraction layer Jira, Bitbucket, Jenkins, TestRail, Automated Build Servers Communications protocols: SPI, I2C, CAN, UART, WirelessHART Power management, bootloaders, DMA, flash memory management What You’ll Need Degree in Computer Science, Embedded Systems, or related discipline. Minimum 3 years of hands-on experience in embedded …
London, England, United Kingdom Hybrid / WFH Options
ZedTalent
materials, and knowledge sharing. Technology Environment STM32 Microcontrollers Zephyr RTOS with C++ abstraction layer Jira, Bitbucket, Jenkins, TestRail, Automated Build Servers Communications protocols: SPI, I2C, CAN, UART, WirelessHART Power management, bootloaders, DMA, flash memory management What You’ll Need Degree in Computer Science, Embedded Systems, or related discipline. Minimum 3 years of hands-on experience in embedded …
more reliable with every release What we’re looking for: Strong experience with Python, particularly in embedded or hardware-heavy environments Solid grasp of systems-level concepts: concurrency, networking, memory management Experience working with hardware integrations, serial protocols, or device control Confident debugging in real-world environments (scopes, logs, traces – whatever gets the job done) Bonus if you …
new product innovations that continue to set AWS's services and features apart in the industry. As a member of the UC organization, you'll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS. Within AWS UC, Amazon Dedicated Cloud (ADC) roles engage with AWS customers who require … administering and managing multiple relational database engines (e.g., Oracle, MySQL, SQL Server, PostgreSQL) - Working knowledge of relational database internals (locking, consistency, serialization, recovery paths) - Systems engineering experience, including Linux performance, memory management, I/O tuning, configuration, security, networking, clusters and troubleshooting. - Coding skills in the procedural language for at least one database engine (PL/SQL, T-SQL …
systems (Kafka, PostgreSQL, Redis, etc.). Familiarity with WebAssembly, WebRTC, or browser-based real-time playback is a big plus. Performance profiling, SIMD optimization, GPU encoding (NVENC, VAAPI), and memory management experience. Comfortable with DevOps workflows, CI/CD, containerization, and cloud deployment. Experience in immersive media: 3D video, 6DoF capture, spatial audio, etc. Prior contributions to open …
and product owners Proven and relevant experience in the games industry Required Tech Skills: Strong programming background in C# or C++ Excellent Unity knowledge Extensive experience in performance tuning, memory management, and debugging for mobile applications Solid grasp of architecture patterns (ECS, MVVM, etc.) Nice to Have Skills: Understanding of the mobile games ecosystem, experience shipping free to …