Will “programmed logic” (that is, GPUs and deep learning accelerators) give way to “hard-wired logic” in artificial intelligence applications? Taalas, a startup that recently emerged from stealth, has no doubt about it (see the news below). Meanwhile, programmed logic keeps advancing, with Cerebras doubling down on its wafer-scale approach and launching a four-trillion-transistor chip. Other news this week, besides Taalas, contributes to the feeling that the end of geometrical scaling won’t stop IT advancements. That includes chiplet-based solutions, of course, but also new transistor types.
Hard-wired AI models promise a 1000x improvement in computational power and efficiency
Toronto-based Taalas has recently exited stealth mode and raised $50 million over two rounds of funding led by Pierre Lamond and Quiet Capital. The company’s mission is to develop an automated flow for rapidly implementing all types of deep learning models (transformers, SSMs, diffusers, MoEs, etc.) in silicon. According to the company, proprietary innovations enable one of its chips to hold an entire large AI model without requiring external memory. Taalas claims that the efficiency of hard-wired computation enables a single chip to outperform a small GPU-based data center, opening the way to a 1000x improvement in the cost of AI. “The path forward is to realize that we should not be simulating intelligence on general purpose computers, but casting intelligence directly into silicon. Implementing deep learning models in silicon is the straightest path to sustainable AI,” said Ljubisa Bajic, Taalas’ CEO. Prior to co-founding Taalas, Bajic founded Tenstorrent in 2016.
Intel outlines a UCIe-3D solution
In a paper recently published in Nature Electronics, a team of Intel researchers proposes a solution for using the UCIe standard in the three-dimensional integration of chiplets. According to the authors, their architectural approach provides power, performance and reliability characteristics approaching or exceeding those of a monolithic system-on-chip design as the bump pitch approaches 1 µm. One finding is that, contrary to trends seen in traditional signalling interfaces, the most power-efficient performance for these architectures can be achieved by reducing the frequency as the bump pitch goes down. The Intel vision is that two chiplets will connect using multiple independent modules, with each UCIe-3D PHY directly controlled by the Network-on-Chip controller. To realize this vision, the authors anticipate challenges in the areas of cooling, power delivery and reliability. Advances in electronic design automation will be necessary, too.
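A rough back-of-the-envelope sketch may help show why a lower frequency can still pay off at fine pitches. On a square grid, the number of bumps per unit area grows with the inverse square of the pitch, so the aggregate bandwidth per square millimetre can rise sharply even when each bump runs slower. The function names, the example data rates and the assumption that every bump carries data are illustrative only; none of these figures comes from the Intel paper.

```python
def bumps_per_mm2(pitch_um: float) -> float:
    """Bumps per square millimetre on a square grid with the given pitch (in µm)."""
    bumps_per_mm = 1000.0 / pitch_um  # 1 mm = 1000 µm
    return bumps_per_mm * bumps_per_mm


def bandwidth_density_gbps_per_mm2(pitch_um: float,
                                   data_rate_gbps: float,
                                   signal_fraction: float = 1.0) -> float:
    """Aggregate bandwidth per mm², in Gb/s.

    signal_fraction is the share of bumps that carry data. The default of 1.0
    is an idealisation: real designs reserve bumps for power and ground.
    """
    return bumps_per_mm2(pitch_um) * signal_fraction * data_rate_gbps


# Hypothetical comparison: a coarse pitch at a higher per-bump rate
# versus a 1 µm pitch at a lower per-bump rate.
coarse = bandwidth_density_gbps_per_mm2(pitch_um=25.0, data_rate_gbps=4.0)
fine = bandwidth_density_gbps_per_mm2(pitch_um=1.0, data_rate_gbps=1.0)

print(f"25 µm pitch at 4 Gb/s per bump: {coarse:,.0f} Gb/s per mm²")
print(f" 1 µm pitch at 1 Gb/s per bump: {fine:,.0f} Gb/s per mm²")
```

In this idealised model the bump count alone provides the bandwidth at the finer pitch, which leaves room to lower the per-bump frequency in exchange for power savings. It says nothing about the power, cooling or reliability trade-offs the authors analyse.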