Guest Blogger Sanjay Gangal
Sanjay Gangal is a veteran of the electronics design industry with over 25 years of experience. He has previously worked at Mentor Graphics, Meta Software, and Sun Microsystems. He has been contributing to EDACafe since 1999.

NVIDIA’s RISC-V Transformation: How Open Architecture Became Central to AI and Computing

November 6th, 2024 by Sanjay Gangal
Last week at the RISC-V Summit in Santa Clara, NVIDIA’s Vice President of Multimedia Architecture/ASIC, Frans Sijstermans, took to the stage to recount NVIDIA’s strategic transformation. For an audience of engineers, developers, and industry insiders, Sijstermans’ keynote was both an inside look at NVIDIA’s journey with RISC-V and a demonstration of how open architecture has fueled rapid technological growth. Over the past seven years, NVIDIA’s adoption of RISC-V has marked a major shift in its approach to microprocessor design, from the niche Falcon core to a globally scaled RISC-V integration that now spans billions of cores.
From Falcon to RISC-V: A Strategic Leap

In 2017, Sijstermans first announced NVIDIA’s plan to phase out Falcon, its proprietary 32-bit microprocessor, and adopt RISC-V. At the time, Falcon had been in use for a decade, embedded across a wide range of applications. But as computing demands grew more complex, NVIDIA recognized the need for a more adaptable, extensible architecture. After evaluating multiple architectures, the company found that RISC-V’s open, customizable foundation aligned with its goals for scalability, security, and performance.

“Customization is really the key here,” Sijstermans emphasized. “As Moore’s Law slows, it becomes increasingly important to use every bit of silicon effectively. RISC-V’s customizable structure allows us to design specific extensions for our applications.”

This customization became one of RISC-V’s most appealing attributes for NVIDIA, which has since been able to innovate rapidly within the flexible framework the architecture provides.
RISC-V’s Integration: Billions of Cores Across Dozens of Applications

Today, RISC-V processors are embedded within virtually every NVIDIA chip, totaling billions of cores shipped in 2024 alone. Each chip can house anywhere from 10 to 40 RISC-V cores, depending on the application, and these cores manage tasks as varied as video decoding, resource allocation, and power management. Sijstermans shared that NVIDIA’s commitment to RISC-V has expanded far beyond the company’s original scope, covering a diverse array of functions integral to NVIDIA’s operations in AI, graphics processing, and data infrastructure.

RISC-V has enabled NVIDIA to implement three core functionalities across a wide range of hardware: resource management, power management, and security. The architecture’s flexibility also allows NVIDIA to design custom extensions specific to its data center needs. In these environments, RISC-V’s 64-bit addressing and scalable page sizes allow seamless integration into NVIDIA’s memory-intensive applications, essential for the vast, distributed data center networks where NVIDIA’s hardware operates.

GPU System Processor (GSP): Transforming How GPUs Interact with Data

One of the keynote’s focal points was NVIDIA’s GPU System Processor (GSP), a RISC-V-based solution that acts as a bridge between host systems and GPUs. The GSP manages data flow to the GPU, creating an abstraction layer that simplifies complex control processes. Instead of communicating directly with GPU control registers, the host processor now interacts with the GSP, which translates high-level commands into low-level register controls. This innovation has not only streamlined GPU operations but also opened up new possibilities for cloud-based applications.

For cloud computing, the GSP’s role is particularly transformative. It enables isolation and resource management across multiple virtual machines, allowing each instance to run independently with its own dedicated resources. “We can guarantee quality of service by controlling resource allocation through the GSP,” Sijstermans explained. This isolation is essential in cloud environments, where security and confidentiality are paramount. Through the GSP, NVIDIA supports confidential computing, giving each user an isolated, secure environment without interference from other tenants on the same hardware.

RISC-V’s Role in AI: Deep Learning Acceleration and Custom Processing

As a leader in AI and deep learning, NVIDIA has leveraged RISC-V to power the control layers within its deep learning accelerators. Sijstermans shared insights into how NVIDIA’s deep learning accelerator (DLA) employs a specialized 64-bit RISC-V vector unit, allowing it to handle the matrix multiplications and nonlinear functions integral to neural network processing.

NVIDIA’s DLA uses both standard and vectorized RISC-V cores to manage the data transformations required for AI inference. RISC-V handles tasks like convolutional operations and nonlinear activations, enabling the DLA to process large datasets and execute deep learning models with high efficiency. The vector unit’s 1024-bit width lets NVIDIA operate on many data elements at once, significantly speeding up operations such as softmax and other activation functions. This design is essential for applications that require real-time inference, such as autonomous driving and interactive AI systems.
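To make the role of the wide vector unit concrete, here is a minimal, generic sketch of the softmax activation in plain C. This is not NVIDIA’s DLA firmware; the function and sizes are illustrative. The point is that every loop below is element-wise, so a 1024-bit vector datapath can cover many elements per instruction (for example, 32 lanes of 32-bit floats) instead of one at a time, whether through compiler auto-vectorization or hand-written vector intrinsics.

```c
#include <math.h>
#include <stdio.h>

/* Numerically stable softmax over n float values.
 * Each loop is element-wise, which is exactly the pattern a wide
 * vector unit accelerates: with 1024-bit registers and 32-bit floats,
 * up to 32 elements can be covered by a single vector instruction. */
static void softmax(const float *in, float *out, int n) {
    float max = in[0];
    for (int i = 1; i < n; i++)      /* vectorizable max-reduction */
        if (in[i] > max) max = in[i];

    float sum = 0.0f;
    for (int i = 0; i < n; i++) {    /* vectorizable exp and sum-reduction */
        out[i] = expf(in[i] - max);
        sum += out[i];
    }
    for (int i = 0; i < n; i++)      /* vectorizable scaling */
        out[i] /= sum;
}

int main(void) {
    float logits[4] = {2.0f, 1.0f, 0.1f, -1.0f};
    float probs[4];
    softmax(logits, probs, 4);
    for (int i = 0; i < 4; i++)
        printf("p[%d] = %f\n", i, probs[i]);
    return 0;
}
```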
Enhanced Security: RISC-V and Custom Extensions in Action

Security has been a central theme in NVIDIA’s RISC-V journey. Through a combination of hardware and software innovations, NVIDIA has ensured that the RISC-V processors in its architecture meet stringent security requirements. Sijstermans emphasized the importance of custom extensions such as pointer masking, which mitigates potential vulnerabilities by restricting how memory pointers are handled (a brief illustrative sketch of the idea appears toward the end of this article).

NVIDIA’s security framework also relies on a separation kernel, a minimal hypervisor-like layer that isolates different components within a single core. It can manage secure and non-secure applications in the same environment while maintaining strict separation, which allows NVIDIA to run sensitive functions, such as asset tracking and safety-critical applications, alongside standard workloads without compromising security.

Building an Ecosystem: RISC-V as a Collaborative Standard

NVIDIA has not only integrated RISC-V but has also become a contributor to its ongoing development. As one of the founding board members, NVIDIA has actively engaged in RISC-V technical working groups, pushing for standards that align with its needs and enhance the architecture’s functionality. Sijstermans noted that NVIDIA is a key player in RISC-V’s community-driven model, leveraging community resources such as compilers and simulators while also contributing its own advancements.

A unique aspect of this ecosystem is NVIDIA’s collaboration with partners like AdaCore, which develops safety-certified Ada/SPARK compilers. These collaborations have allowed NVIDIA to tap into a growing repository of RISC-V resources, accelerating its internal development and keeping it at the cutting edge of safety and performance in embedded systems.

A Vision for the Future: One Architecture, Dozens of Applications, Billions of Cores

Reflecting on NVIDIA’s journey, Sijstermans emphasized that RISC-V’s success at NVIDIA is rooted in its ability to support a unified architecture across diverse applications. From AI accelerators to cloud computing, RISC-V provides a cohesive foundation that can be customized without fragmenting NVIDIA’s internal ecosystem. “We have one architecture, but it powers dozens of applications across billions of cores,” Sijstermans stated. Keeping all cores under a single RISC-V architecture has minimized development overhead and maximized efficiency, even as NVIDIA’s core designs have evolved.

The decision to adopt RISC-V was not without risk, but NVIDIA’s bet on open architecture has paid off. For Sijstermans and his team, RISC-V represents more than an alternative to proprietary cores; it embodies a new paradigm of open, customizable, community-driven innovation. “The RISC-V model, where we can freely design and implement our own extensions, allows us to push the boundaries of what’s possible in silicon,” he said, underscoring NVIDIA’s vision for the future of microprocessor design. With RISC-V now deeply embedded in NVIDIA’s product lines, the architecture is no longer a peripheral experiment; it is a cornerstone of NVIDIA’s innovation strategy. Sijstermans concluded with a message for the RISC-V community: customization, efficiency, and openness are the keys to the future of computing, especially as the industry continues to evolve.
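As a closing aside, the pointer-masking extension mentioned in the security section above can be illustrated with a small user-space sketch. The rough idea is that, when the extension is enabled by privileged software, the hardware ignores a configurable number of upper address bits on memory accesses, so software can carry a tag in those bits for checks such as memory tagging. The constants and helpers below are purely hypothetical and only emulate the masking step in software; they are not NVIDIA’s extension or firmware.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical layout: pretend the top 8 bits of a 64-bit virtual
 * address are ignored by the hardware, so they can hold a tag.
 * Real pointer-masking behavior is configured by privileged software;
 * this program only emulates the masking step in user space. */
#define TAG_SHIFT 56
#define ADDR_MASK ((1ULL << TAG_SHIFT) - 1)

static void *tag_pointer(void *p, uint8_t tag) {
    /* Stash metadata in the otherwise-ignored upper bits. */
    return (void *)(((uintptr_t)p & ADDR_MASK) | ((uintptr_t)tag << TAG_SHIFT));
}

static void *strip_tag(void *p) {
    /* What masking hardware effectively does on every access:
     * the tag bits never reach address translation. */
    return (void *)((uintptr_t)p & ADDR_MASK);
}

int main(void) {
    int *data = malloc(sizeof *data);
    *data = 42;

    int *tagged = tag_pointer(data, 0xA5);  /* pointer now carries a tag */
    int *usable = strip_tag(tagged);        /* mask before dereferencing */

    printf("value = %d, tag = 0x%02X\n", *usable,
           (unsigned)((uintptr_t)tagged >> TAG_SHIFT));
    free(data);
    return 0;
}
```

Without hardware masking, dereferencing the tagged pointer directly would fault; with it, loads and stores behave as if the tag bits had been stripped, which is what lets tag checks ride along with ordinary pointers.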
NVIDIA’s journey, from replacing Falcon to embedding billions of RISC-V cores in its hardware, stands as a testament to the transformative power of open standards in driving technological progress.

Tags: customization, deep learning accelerator, GPU system processor, NVIDIA, open architecture, RISC-V