EDACafe Editorial Roberto Frazzoli
Roberto Frazzoli is a contributing editor to EDACafe. His interests as a technology journalist focus on the semiconductor ecosystem in all its aspects. Roberto started covering electronics in 1987. His weekly contribution to EDACafe started in early 2019.

AI-based macro placement; open-use LLMs; new silicon-compatible materials for AI applications
March 30th, 2023 by Roberto Frazzoli
Artificial intelligence is the common underlying theme for most of this week’s updates. Among them, Nvidia is in the news with an EDA research work, after last week’s announcement concerning its solution for computational lithography, the last software step before mask production.

Nvidia research on AI-based macro placement

At the recent ISPD (International Symposium on Physical Design), a group of Nvidia researchers presented a paper on AI-based macro placement. The paper proposes AutoDMP, a methodology that leverages DREAMPlace, a preexisting open-source GPU-accelerated placer, to place macros and standard cells concurrently, in conjunction with automated parameter tuning based on a multi-objective hyperparameter optimization technique. As a result, the team could generate high-quality, predictable solutions, improving the macro placement quality of academic benchmarks compared to baseline results generated from academic and commercial tools. According to the Nvidia researchers, AutoDMP is also computationally efficient, optimizing a design with 2.7 million cells and 320 macros in three hours on a single Nvidia DGX Station A100.

The key contributions of the work include using multi-objective Bayesian optimization to search the design space of macro placements, targeting three PPA proxy objectives post-place: wirelength, cell density, and congestion; using a two-level PPA evaluation scheme to manage the complexity of the search space; and enhancing the DREAMPlace placer. Open-source benchmarks used include Ariane, a single-core RISC-V CPU; the MemPool Group and BlackParrot designs, many-core RISC-V CPUs with large amounts of on-chip SRAM; and an NVDLA partition.

A previous research work on AI-based macro placement, from Google, had been criticized for not providing enough publicly available data and for comparing the AI performance to an unspecified human expert’s performance.
The new Nvidia work seems to be able to withstand these types of criticism, as it includes details on benchmarking and compares the AI performance with a commercial EDA tool, Cadence Innovus. The work’s source code is released on GitHub.
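To illustrate what "multi-objective" means in the search described above, here is a minimal Python sketch of Pareto-dominance filtering over the three proxy objectives named in the paper (wirelength, cell density, congestion). It is not the AutoDMP implementation and it does not perform Bayesian optimization: the candidate values are invented for illustration, and the function names `dominates` and `pareto_front` are hypothetical. It only shows how a set of candidate placements can be reduced to those that are not beaten on every objective by some other candidate.

```python
from typing import List, Tuple

# (wirelength, cell density, congestion); lower is better for all three.
# The numbers used below are made up for illustration, not results from the paper.
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b on every objective
    and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical normalized scores for four candidate parameter settings.
candidates = [
    (1.00, 0.80, 0.30),
    (0.90, 0.85, 0.35),
    (1.10, 0.90, 0.40),  # worse than the first candidate on all three objectives
    (0.95, 0.70, 0.50),
]

front = pareto_front(candidates)
for point in front:
    print(point)
```

In a multi-objective optimizer, a front like this is what the search tries to push toward lower values; picking a single placement from it remains a trade-off decision among the objectives.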
Cerebras releases seven open-use, trained GPT-based large language models

Cerebras, well known for its wafer-scale AI chips, has trained and is releasing a series of seven GPT-based large language models (LLMs) for open use by the research community. As Cerebras pointed out in a press release, this is the first time a company has used non-GPU based AI systems to train LLMs up to 13 billion parameters and is sharing the models, weights, and training recipe via the industry-standard Apache 2.0 license. The seven models have 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B parameters respectively. Typically a multi-month undertaking, the training work was completed in a few weeks thanks to the speed of the Cerebras CS-2 systems that make up the Andromeda supercomputer, and the ability of Cerebras’ weight streaming architecture to eliminate the pain of distributed compute. According to Cerebras, this release provides several benefits: among them, the pre-trained models can be tailored for industry-specific applications with minimal work; and a new scaling law can be derived based on an open dataset, allowing researchers to predict how a given compute training budget translates to model performance. Cerebras considers these models a demonstration of a “data-parallel only” approach to training, as opposed to LLM training on GPUs, which requires a complex amalgam of pipeline, model, and data parallelism techniques.

New silicon-compatible materials enable low-power neural networks

Two unrelated research works have recently addressed the need for adapting new low-power technologies to the requirements of conventional silicon processes, in an effort to reduce neural network power consumption. A team of scientists at King Abdullah University of Science and Technology (KAUST) has successfully integrated two-dimensional materials on CMOS silicon microchips, and achieved excellent integration density, electronic performance and yield.
The KAUST team used a two-dimensional insulating material called ‘multilayer hexagonal boron nitride’, about 6 nanometers thick. The resulting memristors can be used to compute spiking neural networks.

A research team at the University of Illinois Urbana-Champaign achieved the first material-level integration of ECRAMs (electrochemical random-access memory) onto silicon transistors, realizing the first practical ECRAM-based deep learning accelerator. ECRAM encodes information by shuffling mobile ions between a gate and a channel. Electrical pulses applied to a gate terminal either inject ions into or draw ions from a channel, and the resulting change in the channel’s electrical conductivity stores information. The stored information is then read by measuring the electric current that flows across the channel. An electrolyte between the gate and the channel prevents unwanted ion flow, allowing ECRAM to retain data as a nonvolatile memory. The research team selected materials compatible with silicon processes: tungsten oxide for the gate and channel, zirconium oxide for the electrolyte, and protons as the mobile ions.

Mobiveil’s controller IP for AP Memory’s pseudo-SRAM

Mobiveil has adapted its PSRAM (pseudo-SRAM) controller IP to leverage the characteristics of AP Memory’s new PSRAM device that goes up to 250 MHz in speed with densities from 64Mb to 512Mb, supporting x8/x16 modes. AP Memory’s PSRAM supports Octal Serial Peripheral Interface (Xccela standard) enabling speeds of up to 1000 Mbytes/s for a 16-pin SPI option. Mobiveil’s PSRAM controller provides support for a direct memory mapped system interface, automatic page boundary handling, linear/wrap/continuous/hybrid burst support, and low power features like deep and half power down. The combined solution mainly targets IoT applications.
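The PSRAM figures quoted above (250 MHz, 16 data pins, 1000 Mbytes/s) are consistent with each other if the interface transfers data on both clock edges. That double-data-rate behavior is an assumption made here, not something stated in the announcement, and the function name below is hypothetical. A short Python sketch of the arithmetic:

```python
def peak_bandwidth_mbytes_per_s(clock_mhz: float,
                                data_pins: int,
                                transfers_per_clock: int = 2) -> float:
    """Peak transfer rate in Mbytes/s for a parallel data bus.

    transfers_per_clock=2 assumes double data rate (one transfer on each
    clock edge); this is an assumption, not a figure from the article.
    """
    megabits_per_second = clock_mhz * transfers_per_clock * data_pins
    return megabits_per_second / 8  # 8 bits per byte


x16_rate = peak_bandwidth_mbytes_per_s(250, 16)
x8_rate = peak_bandwidth_mbytes_per_s(250, 8)
print(f"x16 mode: {x16_rate:.0f} Mbytes/s")
print(f"x8 mode:  {x8_rate:.0f} Mbytes/s")
```

Under the same assumption, the x8 mode at 250 MHz would peak at half the x16 rate. These are theoretical peak values that ignore command, address and refresh overhead.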
Different views on how the CHIPS Act should help rebuild the US packaging industry

Two different industry groups have just published white papers with their recommendations on how to use the CHIPS Act subsidies to revitalize the U.S. semiconductor packaging industry. The two documents are significantly different from one another. The IPC document recommends a “do something” approach, maintaining that “sooner is better than perfect”, prioritizing the construction of a single IC substrate fabrication pilot line and observing that “spending more time planning, talking and debating will not get us to the desired competitiveness position more quickly.” The document from ASIC (American Semiconductor Innovation Coalition) specifically refers to the National Advanced Packaging Manufacturing Program (NAPMP) to be established within the context of the CHIPS Act, and recommends structuring this initiative around five pilot lines, each working on a different packaging form factor: wafer-based, panel-based, system-in-package (SiP), unit-based and flex. Each pilot line would be supported by a specific “Coalition of Excellence”.