Guest Blogger Sanjay Gangal
Sanjay Gangal is a veteran of the Electronics Design industry with over 25 years of experience. He has previously worked at Mentor Graphics, Meta Software, and Sun Microsystems. He has been contributing to EDACafe since 1999.

EDACafe Industry Predictions for 2025 – Alif Semiconductor
January 9th, 2025, by Sanjay Gangal
By Reza Kazerounian and Camron Kazerounian

AI at the edge is too slow and power-hungry today: 32-bit microcontrollers need a major AI upgrade

If the vision of pervasive AI at the edge is to become a reality, manufacturers of microcontrollers (MCUs) will need to re-engineer their products, tailoring them for the power, cost, and size constraints common in the embedded world.

AI: the tool for handling growing data volumes at the edge

The number of devices deployed at the edge is growing, and the amount of data that each produces is also growing fast. This is driving the need for on-device AI processing. As the global datasphere continues to expand – it is forecast to quadruple to close to 600 zettabytes (ZB) by 2030 – the sheer volume of data threatens to overwhelm traditional cloud computing systems.
Fig. 1: a smart watch can use edge AI to provide health insights on the basis of vital sign data. (Image credit: Jens Mahnke)

Although embedded devices have highly constrained resources, there are big advantages to performing AI processing locally, at the edge:
- By executing AI inferencing operations faster at the edge, embedded devices can enable applications that would be impossible if they relied on cloud servers. Personal health monitoring products, for instance, that detect heart arrhythmia or that recognize changes in blood pressure and other vital signs need to perform rapid data analysis if they are to alert users to potential health risks in time to take effective action. Latency introduced by cloud processing could prevent the device from providing the warning that would have led to early and successful intervention.
- The same need for an immediate response with very low latency applies to predictive maintenance systems that use AI in industrial machinery.
- In the consumer world, augmented reality (AR) smart glasses have the same need for immediate response: the user notices if the display of on-demand or context-aware information about the environment – such as identifying landmarks or translating text in real time – is not immediate.

These examples show why real-time, secure, and low-power AI data processing needs to take place at the edge.

The types of endpoint AI applications and models

In response to the move to bring AI processing to the endpoint, AI scientists have accelerated the development of ML model optimizations. These aim to shrink the memory footprint of ML models without impairing the accuracy of inference: the techniques include model pruning, compression, and quantization, as the sketch below illustrates. Developers are also creating special models tailored for specific applications running at the edge.
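As a concrete illustration of one of these techniques, the following is a minimal sketch of post-training int8 quantization using the TensorFlow Lite converter. The tiny model and the random calibration data are placeholders for the sake of a self-contained example; a real project would substitute its own trained network and representative input samples.

```python
# Minimal sketch: post-training int8 quantization with TensorFlow Lite.
# The model and calibration data below are placeholders, not a real workload.
import numpy as np
import tensorflow as tf

trained_model = tf.keras.Sequential([
    tf.keras.Input(shape=(96, 96, 1)),             # e.g. a small vision model
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),
])

def representative_data():
    # Calibration samples let the converter choose int8 scaling factors.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 1).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(trained_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8       # fully integer model, suitable
converter.inference_output_type = tf.int8      # for MCU-class integer NPUs

tflite_model = converter.convert()
open("model_int8.tflite", "wb").write(tflite_model)
```

Quantizing weights and activations from float32 to int8 typically cuts a model's memory footprint to roughly a quarter of its original size and allows it to run on integer-only NPUs, usually with only a small loss of inference accuracy.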
Fig. 2: edge AI powers predictive maintenance to reduce downtime and accelerate repairs of industrial equipment

The ambition of embedded device manufacturers goes further, embracing the potential of LLMs for natural language processing in applications such as voice interfaces: this means that the MCUs in edge devices will one day need to support this technology. LLMs would also support ambitious features such as real-time translation of foreign-language text or speech – a highly valuable application for devices such as smart glasses. Making such devices capable of recognizing voice commands in natural speech, or of providing context-aware interpretation of places or objects, could be similarly transformative.

The added value available to embedded device manufacturers is, then, clear to see. But how can these new kinds of AI be implemented at the edge without also scaling up power consumption and cost?

How new MCUs should adapt to the requirements of edge AI

The reality is that today's MCUs are failing to meet the requirements even of the ML models already in use in embedded devices at the edge, let alone the increased demands that will be placed on them as ML models evolve. Embedded device manufacturers are missing out on opportunities to exploit the full potential of AI because today's MCUs and systems-on-chip (SoCs) consume too much power, are too large, and perform inferencing functions too slowly.

Some progress is already being made: pioneers including Alif Semiconductor have introduced new families of MCU that include a neural processing unit (NPU) and other features intended for use in AI inferencing. Products such as the Ensemble MCUs and the Balletto Bluetooth MCUs provide high-performance, low-power operation in voice, image, and motion analysis processing, particularly when the application deploys CNNs (see Fig. 3).

But looking beyond today's implementations of AI at the edge, there is huge untapped potential in the use of new transformer-based small language models (SLMs), as well as in newer types of neural network including graph neural networks (GNNs) and spiking neural networks (SNNs). Implementing these is beyond even the best of today's AI-optimized MCUs.

Fig. 3: the Balletto wireless MCU enables sophisticated voice and audio AI functions in wearable devices

So what is needed in a new generation of MCUs that is missing today? MCUs for AI will evolve in three particular directions.

The first is the integration of advanced NPUs, providing a dedicated hardware resource optimized for AI inferencing. To exploit the potential of transformer-based models and emerging types of neural network, embedded devices will in future need to offer performance better than 1 tera-operations per second (TOPS), while keeping power consumption so low that small battery-powered products such as earbuds can operate for at least a day between charges; a rough energy budget illustrating this constraint follows below.

The second change is to the provision of memory to support AI operations in an MCU. As AI systems running models such as SLMs evolve, they will need to draw on much more memory capacity, both embedded on the chip and external. In response, MCU manufacturers must provide interfaces to external memory that are faster and operate at lower power. New MCU architectures will also change to support new, faster pipelines for moving data within the MCU.
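Before turning to the third direction, it is worth quantifying the power constraint attached to the first. The back-of-envelope calculation below shows why a raw TOPS figure is meaningless without a matching efficiency figure. Every number in it (battery capacity, target hours, the share of the battery granted to AI, and the average workload) is an illustrative assumption for the sake of the arithmetic, not a measured value or a vendor specification.

```python
# Back-of-envelope energy budget for AI on an earbud-class device.
# All figures are illustrative assumptions, not measured or vendor numbers.

battery_mwh = 60 * 3.7      # assumed 60 mAh cell at 3.7 V nominal -> ~222 mWh
target_hours = 18           # assumed "all-day" use between charges
ai_share = 0.25             # assumed fraction of the battery granted to AI

# Average power the AI subsystem may draw to hit the target battery life.
ai_budget_mw = battery_mwh * ai_share / target_hours
print(f"AI power budget: {ai_budget_mw:.2f} mW")        # ~3.1 mW

# Assume the workload averages 10 G ops/s, e.g. a model that bursts to
# ~1 TOPS for roughly 1% of the time and sleeps otherwise.
avg_ops_per_sec = 10e9
required_tops_per_watt = (avg_ops_per_sec / 1e12) / (ai_budget_mw / 1e3)
print(f"Required efficiency: {required_tops_per_watt:.1f} TOPS/W")  # ~3.2
```

Even with a heavily duty-cycled workload, the NPU must sustain an efficiency on the order of several TOPS per watt to meet an all-day battery target, which is why a dedicated NPU, rather than a general-purpose CPU core, leads the list of changes.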
The third change is in integration: MCUs are able to operate as SoCs because of the wide range of functions they integrate, but integration needs to be extended to support the operation of AI systems with more sensors, while keeping the footprint of an edge node as small as possible.

If the MCU industry can rise to the challenge of AI at the edge, it could profit from a sea change in demand for compute power, as people's experience of AI shifts from high-energy CPU- and GPU-powered hubs to personal digital devices based on an MCU. This turns today's assumptions about how people interact with technology upside down, reducing their dependence on smartphone apps and internet services accessed via a PC. Instead, more functions will be provided by a multitude of smart devices which are embedded in the environment and which communicate with each other to provide services such as health monitoring, energy management, and security and surveillance. The decentralized intelligence provided by many embedded devices at the edge will replace the centralized model in which most technology services are provided via the smartphone.

To realize this new vision, the MCU industry needs to provide a new generation of products based on architectures optimized for AI and ML functions operating with low latency and at low power.

About the Authors:

Reza Kazerounian is an innovative leader who has been recognized for building and managing multibillion-dollar businesses in the Automotive and Industrial sectors, as well as in IoT solutions. He is well known for his work in the areas of microcontrollers, embedded processing, and connectivity. Prior to co-founding Alif Semiconductor, Reza was Senior Vice President and General Manager of the Microcontroller and Connectivity Business Unit at Atmel Corporation until its acquisition in 2016. Before Atmel, Reza served as SVP and GM of Freescale's Automotive, Industrial, and Multi-Market Solutions product groups, with annual revenue exceeding $2.7B and accounting for over 65% of corporate revenue; this business comprised the Microcontroller, Connectivity, MEMS Sensors, and Analog product divisions. Before Freescale, Reza held key appointments at STMicroelectronics: he served as CEO of the STM Americas region until March 2009, and prior to that he was Group Vice President and GM of the Smart Card Security and Programmable Systems Memory divisions at STM's corporate headquarters in Geneva, Switzerland. Reza holds a B.S. from the University of Illinois, Chicago, and a Ph.D. from the University of California, Berkeley, both in Electrical Engineering and Computer Sciences.