EDACafe Editorial Roberto Frazzoli
Roberto Frazzoli is a contributing editor to EDACafe. His interests as a technology journalist focus on the semiconductor ecosystem in all its aspects. Roberto started covering electronics in 1987. His weekly contribution to EDACafe started in early 2019.

Hard-wired AI models; UCIe in 3D packages; reconfigurable FETs; Samsung’s HBM3; Silicon Box in Italy
March 14th, 2024 by Roberto Frazzoli
Will “programmed logic” (that is, GPUs and deep learning accelerators) give way to “hard-wired logic” in artificial intelligence applications? Taalas, a startup that recently emerged from stealth, has no doubt about that (see the news below). Meanwhile, programmed logic keeps advancing, with Cerebras doubling down on its wafer-scale approach and launching a four-trillion-transistor chip. Other news this week, besides Taalas, contributes to the feeling that the end of geometrical scaling won’t stop IT advancements. That includes chiplet-based solutions, of course, but also new transistor types.

Hard-wired AI models promise a 1000x improvement in computational power and efficiency

Toronto-based Taalas has recently exited stealth mode and raised $50 million over two rounds of funding led by Pierre Lamond and Quiet Capital. The company’s mission is to develop an automated flow for rapidly implementing all types of deep learning models (transformers, SSMs, diffusers, MoEs, etc.) in silicon. According to the company, proprietary innovations enable one of its chips to hold an entire large AI model without requiring external memory. Taalas claims that the efficiency of hard-wired computation enables a single chip to outperform a small GPU-based data center, opening the way to a 1000x improvement in the cost of AI. “The path forward is to realize that we should not be simulating intelligence on general purpose computers, but casting intelligence directly into silicon. Implementing deep learning models in silicon is the straightest path to sustainable AI,” said Ljubisa Bajic, Taalas’ CEO. Prior to co-founding Taalas, Bajic founded Tenstorrent in 2016.

Intel outlines a UCIe-3D solution

In a paper recently published in Nature Electronics, a team of Intel researchers proposes a solution for using the UCIe standard in the three-dimensional integration of chiplets.
According to the authors, their architectural approach provides power, performance and reliability characteristics approaching or exceeding those of a monolithic system-on-chip design as the bump pitch approaches 1 µm. Research findings include that, contrary to trends seen in traditional signalling interfaces, the most power-efficient performance for these architectures can be achieved by reducing the frequency as the bump pitch goes down. The Intel vision is that two chiplets will connect using multiple independent modules, with each UCIe-3D PHY directly controlled by the Network-on-Chip controller. To realize this vision, the authors anticipate challenges in the areas of cooling, power delivery and reliability. Advances in electronic design automation will be necessary, too.
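
The frequency observation can be motivated with simple geometry. The following is a minimal sketch, not the authors' model: it assumes a square bump grid in which every bump carries one data bit per clock cycle, and the pitches and target bandwidth density are illustrative values, not figures from the Intel paper.

```python
def bumps_per_mm2(pitch_um: float) -> float:
    """Bumps per square millimetre on a square grid with the given pitch (assumed layout)."""
    per_mm = 1000.0 / pitch_um  # bumps along one millimetre
    return per_mm * per_mm


def rate_for_density(target_gbps_per_mm2: float, pitch_um: float) -> float:
    """Per-bump data rate (Gb/s) needed to reach a target areal bandwidth density."""
    return target_gbps_per_mm2 / bumps_per_mm2(pitch_um)


if __name__ == "__main__":
    target = 100_000.0  # hypothetical target, in Gb/s per mm^2
    for pitch in (25.0, 9.0, 3.0, 1.0):  # illustrative pitches, in micrometres
        rate = rate_for_density(target, pitch)
        print(f"pitch {pitch} um: {bumps_per_mm2(pitch):.0f} bumps/mm^2, "
              f"{rate:.3f} Gb/s per bump")
```

Because the bump count grows with the inverse square of the pitch, the same areal bandwidth can be reached at a much lower per-bump rate as the pitch shrinks. The sketch only shows this scaling; it does not model power.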

Korean academic AI chip runs GPT-2 on 400 milliwatts

A research team at the Korea Advanced Institute of Science and Technology (KAIST) has reportedly developed an AI chip capable of processing the GPT-2 large language model with a power consumption of 400 milliwatts and a processing time of 0.4 seconds. According to the report, the 4.5-mm-square chip, built with Samsung Foundry’s 28 nanometer process, consumes 625 times less power than Nvidia’s A100 GPU and is also 41 times smaller. Key to these achievements is a combination of deep neural networks and spiking neural networks.

Reducing transistor count with no-doping reconfigurable FETs

Researchers from TU Wien (Vienna University of Technology, in Vienna, Austria) have reported advances in the development of circuits based on Reconfigurable Field-Effect Transistors (RFETs). Unlike normal FETs, an RFET combines n- and p-type operation in a single device thanks to “electrostatic doping”: electric charge is introduced via an additional electrode, which determines how the transistor should behave. Avoiding the usual chemical doping means that the transistor behavior is not pre-determined by the fabrication process, and can therefore be switched during operation. Taking advantage of run-time reconfiguration, the team has succeeded in building fundamental logic circuits (an inverter, as well as NAND/NOR and XOR/XNOR gates) with a lower transistor count than conventional circuits with static transistors. For example, an XOR gate can be built with only four RFETs and can be switched to XNOR operation at run-time. The RFETs being studied at TU Wien are potentially compatible with the CMOS process.

Customizing a RISC-V core with a configurator tool

Spain-based Semidynamics has released a new tool called Configurator, enabling users to speed up full customization of the company’s RISC-V processor IP.
The tool uses dozens of blocks that have already been verified by Semidynamics, so that the final core is also pre-verified. First, the Configurator provides an easy way for the user to specify the configuration parameters for the IP. Second, it allows the user to describe any additional changes required. This description is then sent to the Semidynamics engineering team for implementation. Some of the choices offered by the Configurator tool include instruction and data cache sizes, main memory bus size and type, and eight optional extensions. Additional options include Semidynamics’ Tensor and RISC-V 1.0 compliant Vector Units with a choice of number of cores and data configuration.

Silicon Box to build a plant in Italy

Singapore-headquartered packaging company Silicon Box has announced its intention to collaborate with the Italian government to invest up to $3.6B (€3.2B) in a new, state-of-the-art semiconductor assembly and test facility in Northern Italy. When completed, the new facility will support approximately 1,600 Silicon Box employees in Italy. Design and planning for the facility will begin immediately, with construction to commence pending European Commission approval of planned financial support by the Italian State. The exact location of the new plant has not been chosen yet.

Samsung’s growing role in the HBM3 market

According to market research firm TrendForce, HBM3 memories currently represent a supply bottleneck for Nvidia because of both CoWoS packaging constraints and their inherently long production cycle. The current HBM3 supply for Nvidia’s H100 solution is primarily met by SK hynix, leading to a supply shortfall. Samsung’s HBM3 offering, however, is growing and has received AMD MI300 series certification. TrendForce, therefore, expects Samsung to rapidly gain market share. Samsung’s progress in the HBM3 market could also be fueled by a technology change which would allow the company to reach a higher production yield.
According to Reuters, the memory maker is planning to switch from non-conductive film (NCF) technology to the mass reflow molded underfill (MR-MUF) method, already used by rival SK hynix. This is what some sources have inferred from the fact that Samsung has recently issued purchase orders for MUF chipmaking equipment. Samsung, however, has denied these rumors.

EV wireless charging reaches 100 kilowatts

A team of researchers at Tennessee-based Oak Ridge National Laboratory has demonstrated that a light-duty passenger electric vehicle can be wirelessly charged at 100 kilowatts with 96% efficiency using polyphase electromagnetic coupling coils with rotating magnetic fields. ORNL’s system, with coils just over 14 inches in diameter, transferred power to a Hyundai Kona EV across a five-inch airgap. According to ORNL, this technology reaches power densities eight to ten times higher than conventional coil technology.
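
As a rough back-of-the-envelope reading of the ORNL figures, the sketch below assumes that the 100 kilowatts is the power delivered to the vehicle (the report does not say which side of the link is meant) and converts the quoted dimensions to metric. The function names are illustrative.

```python
INCH_TO_CM = 2.54  # exact by definition


def input_power_kw(delivered_kw: float, efficiency: float) -> float:
    """Power drawn on the transmit side, assuming delivered_kw reaches the vehicle."""
    return delivered_kw / efficiency


def loss_kw(delivered_kw: float, efficiency: float) -> float:
    """Power lost in the wireless link under the same assumption."""
    return input_power_kw(delivered_kw, efficiency) - delivered_kw


if __name__ == "__main__":
    print(f"input power: {input_power_kw(100.0, 0.96):.2f} kW")
    print(f"link losses: {loss_kw(100.0, 0.96):.2f} kW")
    print(f"coil diameter: just over {14 * INCH_TO_CM:.1f} cm")
    print(f"airgap: {5 * INCH_TO_CM:.1f} cm")
```

Under this assumption, roughly 4 kilowatts would be dissipated in the link at full power. If the 100 kilowatts instead refers to the transmit side, the loss would be exactly 4 kilowatts and 96 kilowatts would reach the vehicle.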