 EDACafe Editorial
Roberto Frazzoli
Roberto Frazzoli is a contributing editor to EDACafe. His interests as a technology journalist focus on the semiconductor ecosystem in all its aspects. Roberto started covering electronics in 1987. His weekly contribution to EDACafe started in early 2019.

Moore’s Law extension a key theme at the 2021 VLSI Symposia

 
June 5th, 2021 by Roberto Frazzoli

New materials, new transistor structures, new integration schemes: many of the options for boosting scaling now being investigated by research teams around the world will be represented at the 2021 Symposia on VLSI Technology & Circuits – running as a virtual event from June 13 to 19. Just as a teaser, in this article we will briefly summarize a handful of papers from the Technology program, to give a taste of some current research trends.

Benefits of forksheet over nanosheet

One of the papers presented by Belgian research institute Imec is meant to demonstrate the benefits of forksheet transistors over nanosheet transistors for CMOS area scaling. Forksheet devices are lateral nanosheet devices with a forked gate structure. The physical separation of NFETs and PFETs by a dielectric wall enables N-P space scaling and consequently sheet width maximization – compared to an N-P nanosheet configuration – for the same footprint. According to Imec, forksheet transistors offer additional benefits in the manufacturing process. Firstly, for nanosheets the high mask aspect ratio makes it challenging to pattern a well-defined N-P boundary over the full stack height. Secondly, the pWFM (p-type Work Function Metal) lateral etch in-between NMOS nanosheets can lead to mask undercut and therefore pWFM removal from PFETs. For forksheet devices, the mask aspect ratio at the N-P boundary is substantially lower because the mask edge lands on top of the wall. In addition, the risk of pWFM removal from PFETs due to mask undercut is eliminated by the physical separation of the pWFM on either side of the wall, including along the gate trench side walls. Electrostatic control for forksheets and nanosheets is comparable.

Nanosheet (left) vs forksheet (right) comparison. Source: 2021 VLSI Symposia


Higher density DRAM alternative; faster simulations; new chip prototyping program

 
May 27th, 2021 by Roberto Frazzoli

Chip shortage and foundry activity continue to make headlines. Tesla is reportedly considering paying in advance for chips to secure its supply, and is said to be even exploring the acquisition of a semiconductor fab. GlobalFoundries is reportedly working with Morgan Stanley on an initial public offering that could value the foundry at about $30 billion. Let’s now move to some process technology and EDA updates.

Vertical nanowire-based memory promises 4X DRAM density without special materials

Singapore-based Unisantis presented the latest developments of its Dynamic Flash Memory (DFM) technology at the recent IEEE International Memory Workshop. According to the company, DFM offers faster speeds and higher density when compared to DRAM or other types of volatile memory. DFM is also a type of volatile memory, but since it does not rely on capacitors it has fewer leak paths, and it has no connection between switching transistors and a capacitor. The result is a cell design with the potential for significant increases in transistor density. Additionally – as it offers ‘block erase’ like a Flash memory – DFM reduces the frequency and the overhead of the refresh cycle and can deliver significant improvements in speed and power compared to DRAM. Based on TCAD simulation, Unisantis claims that DFM can potentially achieve a 4X density increase compared to DRAM. So, while the scaling of DRAM has almost stopped at 16Gb, DFM could be used to build 64Gb memory devices. Unisantis points out that unlike the so-called ‘emerging memory technologies’ (MRAM, ReRAM, FRAM, PCM), its Dynamic Flash Memory does not require using additional materials on top of a standard CMOS process. DFM was developed by Unisantis using the principles of its patented surround gate transistor (SGT) technology, also referred to in the semiconductor industry as a vertical nanowire transistor. According to the company, the benefits of this technology include improved area density, compared to planar and FinFET transistors; reduced leakage power, due to the strong electrostatic control of the surrounding gate over the transistor channel; and the possibility of optimizing the transistor width and length for different power/performance combinations. Unisantis is working on SGT technology in collaboration with Belgian research institute Imec.

DFM structure. Credit: Unisantis


Chip lead times; Samsung EUV lines in Austin; Google 3D videoconferencing; data-driven algorithm design

 
May 20th, 2021 by Roberto Frazzoli

Chip shortage and new fab plans continue to be hot topics this week, while there is no shortage of AI news – with Google announcing the next generation of TPUs, and Edge Impulse putting forward an interesting idea: that machine learning is bound to replace code writing in algorithm design.

Chip lead times reach 17 weeks

According to research by Susquehanna Financial Group, quoted by Bloomberg, chip lead times – the gap between order and delivery – increased to 17 weeks in April. That is the longest wait since the firm began tracking the data in 2017. Specific product categories reported even longer lead times: 23.7 weeks in April for power management chips, about four weeks more than a month earlier; industrial microcontrollers also showed a worsened situation, with order lead times extended by three weeks. Automotive chip supply continues to be a pain point, with NXP reportedly having lead times of more than 22 weeks now – up from around 12 weeks late last year – and STMicroelectronics’ lead times extending to more than 28 weeks. This situation is raising concerns about ‘panic ordering’ that may lead to market distortions in the future.


IBM’s 2nm chip; EDA updates; AI updates; acquisitions

 
May 13th, 2021 by Roberto Frazzoli

Catching up on some of the news from the last four weeks or so, the IBM 2-nanometer announcement definitely stands out as a major update. Several recent news items also concern EDA, as well as AI accelerators. Two of the newest updates about AI startups will translate into an additional $150 million pumped into this industry by investors.

IBM’s 2-nanometer chip

As widely reported by many media outlets, on May 6 IBM announced the development of the world’s first chip with 2-nanometer nanosheet technology. The result was achieved by the IBM research lab located at the Albany Nanotech Complex in Albany, NY, where IBM scientists work in collaboration with public and private sector partners. According to the company, IBM’s new 2-nanometer chip technology will achieve 45 percent higher performance, or 75 percent lower energy use, than today’s most advanced 7-nanometer node chips. Reporting on the announcement, EETimes underlined that this chip is the first to use extreme-ultraviolet lithography (EUV) for front-end of line (FEOL) processes. Other details reported by EETimes include the use of bottom dielectric isolation to eliminate leakage current between nanosheets and the bulk wafer; and a novel multi-threshold-voltage scheme. Reportedly, IBM expects 2-nanometer foundry technology based on this work to go into production towards the end of 2024.

2 nm technology as seen using transmission electron microscopy. Courtesy of IBM.


A closer look at Cadence’s new Palladium Z2 Enterprise Emulation and Protium X2 Enterprise Prototyping systems

 
May 7th, 2021 by Roberto Frazzoli

As gate count of advanced chips gets bigger and bigger, design teams need more powerful emulation and prototyping systems to reduce time-to-market. Cadence, for its part, is responding to this need with the introduction of its Palladium Z2 Enterprise Emulation and Protium X2 Enterprise Prototyping systems, the latest generation of a coordinated solution that the company has dubbed “Dynamic Duo”. Let’s now take a closer look at these two new systems with the help of Paul Cunningham, Senior Vice President, System & Verification Group at Cadence, who recently gave a video interview on this topic to Sanjay Gangal from EDACafe.

EDACafe interviews Paul Cunningham, Senior Vice President, System & Verification Group at Cadence

Doubled capacity, 50% performance increase

The key to improved performance and capacity – compared to the previous generation of these systems, Palladium Z1 and Protium X1 – is the adoption of new processing engines. “They are powered by two different chips,” Cunningham explained. “Palladium Z2 is powered by a custom ASIC we actually built here at Cadence, (…) and Protium X2 is based on a massive capacity, leading edge Xilinx FPGA, the VU19P.” As Cunningham pointed out, Cadence has built two entirely new platforms around these chips, with new racks and new boards, achieving significant results: “Within the same rack footprint [as the previous generation], we are doubling the capacity per rack and we are increasing the performance by 50%. So there’s a very significant uplift in both these platforms.”


New AI architectures in the spotlight at Linley Spring Processor Conference 2021

 
April 29th, 2021 by Roberto Frazzoli

Cerebras’ new 2.6-trillion-transistor wafer-scale chip is one of the announcements made during the 2021 edition of the Linley Spring Processor Conference, a virtual event organized by technology analysis firm The Linley Group from April 19 to 23. In our quick overview of the conference we will focus mainly on new product announcements, which include innovative AI intellectual property from startups Expedera and EdgeCortix, a new approach to clock distribution from Movellus, and more. But first, let’s briefly summarize the opening keynote given by Linley Gwennap – Principal Analyst of The Linley Group – who provided an updated overview of AI technology and market trends.

A variety of AI acceleration architectures

Gwennap described the different AI processing architectures that the industry has developed over the past few years. While many CPUs, GPUs, and DSPs include wide vector (SIMD) compute units, many AI accelerators use systolic arrays to break the register-file bottleneck. Also, convolution architectures optimized for CNNs have been proposed: examples include processors developed by Alibaba and Kneron. Within AI-specialized architectures, many choices are possible: a processor can use many little cores, or a few big cores. Extreme examples are Cerebras with its wafer-scale chip integrating over 400,000 cores (850,000 in the latest version), and Groq with one mega-core only. Little cores are easier to design, while big cores simplify compiler/software design and are better for real-time workloads. Another architectural choice is between multicore and dataflow: in a multicore design, each core executes the neural network from start to finish, while in a dataflow design the neural network is divided across many cores. An additional architectural style – that goes ‘beyond cores’ – is Coarse-Grain Reconfigurable Architecture (CGRA), which uses dataflow principles, but instead of cores, the connected blocks contain pipelined compute and memory units. This approach has been adopted by SambaNova, SimpleMachines, Tsing Micro and others. So the industry now offers a wide range of AI-capable architectures, ranging from very generic to very specialized. In general terms, a higher degree of specialization translates into higher efficiency but lower flexibility.
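The multicore versus dataflow distinction described above can be illustrated with a toy Python sketch. This is only a sequential simulation under simplifying assumptions, not a model of any real accelerator: the "network" is a list of scalar functions, and the names `run_multicore` and `run_dataflow` are hypothetical, introduced here for illustration.

```python
from typing import Callable, List

# A "layer" is modeled as a plain function on a scalar activation (toy assumption).
Layer = Callable[[float], float]


def run_multicore(layers: List[Layer], inputs: List[float], num_cores: int) -> List[float]:
    """Multicore style: every core holds the whole network and runs it
    from start to finish on its own share of the inputs."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    outputs = [0.0] * len(inputs)
    for core in range(num_cores):
        # Inputs are dealt out to cores round-robin.
        for i in range(core, len(inputs), num_cores):
            x = inputs[i]
            for layer in layers:
                x = layer(x)
            outputs[i] = x
    return outputs


def run_dataflow(layers: List[Layer], inputs: List[float]) -> List[float]:
    """Dataflow style: the network is divided across cores, one layer per core,
    and activations flow from one core to the next."""
    activations = list(inputs)
    for layer in layers:
        # This loop body stands in for the single core that owns this layer.
        activations = [layer(x) for x in activations]
    return activations


if __name__ == "__main__":
    toy_network = [lambda x: 2.0 * x, lambda x: x + 1.0, lambda x: max(x, 0.0)]
    samples = [-3.0, 0.0, 2.5]
    print(run_multicore(toy_network, samples, num_cores=2))
    print(run_dataflow(toy_network, samples))
```

Both functions compute the same results; what differs is where the work lives. In the first, each core needs the full network; in the second, each core needs only its own layer but depends on its neighbors for data.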


EDA startup Avishtech innovates PCB stack simulation and loss modeling of PCB transmission lines

 
April 22nd, 2021 by Roberto Frazzoli

Selecting the right construction for a PCB stack and meeting the tight loss budget of PCB transmission lines are major challenges for designers and manufacturers of high-frequency printed circuit boards. According to Avishtech – a young San Jose-based provider of innovative EDA solutions – traditional EDA tools fall short of needs in those two areas, often leading to a trial-and-error development process that translates into long design cycles and increased costs.

Avishtech started addressing these problems in 2019. “That’s when me and my partners saw an opportunity to really make an impact and actually do things in a very different way,” said founder and CEO Keshav Amla in the video interview he recently gave to Sanjay Gangal from EDACafe. “We had the right backgrounds and we felt that we were the right people to do that.” So after completing his master’s degree, in 2019 Amla left his PhD program to work on Avishtech full-time. One year later, in July 2020, the company launched its Gauss product line: Gauss Stack, a PCB stack-up design and simulation solution, and Gauss 2D, a field solver that improves transmission line loss modeling. Let’s now take a closer look at Avishtech and at the recently announced latest versions of its tools.


Nvidia’s datacenter CPU; fast AI training on x86; Siemens acquires OneSpin; EDA Q4 results

 
April 15th, 2021 by Roberto Frazzoli

Nvidia entering the datacenter CPU market – and becoming a direct competitor of Intel in this area – is definitely this week’s top news. Unrelated to this announcement, new academic research adds to the debate on heterogeneous compute. More updates this week include an important EDA acquisition and EDA figures; but first, let’s meet Grace.

Grace, the new Arm-based Nvidia datacenter CPU

Intel’s recently appointed CEO Pat Gelsinger is facing an additional challenge: defending the company’s datacenter CPU market share against Grace, the new Nvidia CPU that promises 10x the performance of today’s fastest servers on the most complex AI and high performance computing workloads. Announced at the current GTC event and expected to be available at the beginning of 2023, the new Arm-based processor is named for Grace Hopper, the U.S. computer-programming pioneer.

In his GTC keynote, Nvidia’s CEO Jensen Huang explained that Grace is meant to address the bottleneck that still makes it difficult to process large amounts of data, particularly for AI models. His example was based on half of a DGX system: “Each Ampere GPU is connected to 80GB of super-fast memory running at 2 TB/sec,” he said. “Together, the four Amperes process 320 GB at 8 Terabytes per second. Contrast that with CPU memory, which is 1TB large, but only 0.2 Terabytes per second. The CPU memory is three times larger but forty times slower than the GPU. We would love to utilize the full 1,320 GB of memory in this node to train AI models. So, why not something like this? Make faster CPU memories, connect four channels to the CPU, a dedicated channel to feed each GPU. Even if a package can be made, PCIe is now the bottleneck. We can surely use NVLink. NVLink is fast enough. But no x86 CPU has NVLink, not to mention four NVLinks.” Huang pointed out that Grace is Arm-based and purpose-built for accelerated computing applications that handle large amounts of data – such as AI. “The Arm core in Grace is a next generation off-the-shelf IP for servers,” he said. “Each CPU will deliver over 300 SPECint with a total of over 2,400 SPECint_rate CPU performance for an 8-GPU DGX. For comparison, today’s DGX, the highest performance computer in the world, is 450 SPECint_rate.” He continued, “This powerful, Arm-based CPU gives us the third foundational technology for computing, and the ability to rearchitect every aspect of the data center for AI. (…) Our data center roadmap is now a rhythm consisting of three chips: CPU, GPU, and DPU. Each chip architecture has a two-year rhythm with likely a kicker in between. One year will focus on x86 platforms, one year will focus on Arm platforms. Every year will see new exciting products from us. The Nvidia architecture and platforms will support x86 and Arm – whatever customers and markets prefer,” Huang said.
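The memory figures in Huang’s half-DGX example can be checked with a few lines of arithmetic. The Python sketch below uses only the numbers quoted above; treating “1TB” as 1,000 GB is an assumption, chosen because it reproduces the quoted 1,320 GB total, and the function name `node_summary` is hypothetical.

```python
# Figures quoted in the keynote for half of a DGX system.
GPU_COUNT = 4
GPU_MEM_GB = 80        # memory attached to each Ampere GPU
GPU_BW_TBPS = 2.0      # bandwidth of each GPU's memory, in TB/s
CPU_MEM_GB = 1000      # "1TB", assumed to mean 1,000 GB (matches the 1,320 GB total)
CPU_BW_TBPS = 0.2      # CPU memory bandwidth, in TB/s


def node_summary() -> dict:
    """Aggregate capacity and bandwidth, plus the CPU-vs-GPU ratios."""
    gpu_mem_gb = GPU_COUNT * GPU_MEM_GB
    gpu_bw_tbps = GPU_COUNT * GPU_BW_TBPS
    return {
        "gpu_mem_gb": gpu_mem_gb,
        "gpu_bw_tbps": gpu_bw_tbps,
        "total_mem_gb": gpu_mem_gb + CPU_MEM_GB,
        # How much larger the CPU memory is than the combined GPU memory.
        "cpu_capacity_ratio": CPU_MEM_GB / gpu_mem_gb,
        # How much faster the combined GPU memory is than the CPU memory.
        "gpu_bandwidth_ratio": gpu_bw_tbps / CPU_BW_TBPS,
    }


if __name__ == "__main__":
    for name, value in node_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the combined GPU memory comes to 320 GB at 8 TB/s and the node total to 1,320 GB, with CPU memory roughly three times larger (about 3.1x) and forty times slower, in line with the quote.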

The NVLink interconnect technology provides a 900 GB/s connection between Grace and Nvidia GPUs. Grace will also utilize an LPDDR5x memory subsystem. The new architecture provides unified cache coherence with a single memory address space, combining system and HBM GPU memory.

The Swiss National Supercomputing Centre (CSCS) and the U.S. Department of Energy’s Los Alamos National Laboratory are the first to announce plans to build Grace-powered supercomputers. According to Huang, the CSCS supercomputer, called Alps, “will be 20 exaflops for AI, 10 times faster than the world’s fastest supercomputer today.” The system will be built by HPE and come on-line in 2023.

Nvidia Grace CPU. Credit: Nvidia


New emulation and prototyping systems; Arm v9 architecture; low power FPGAs; HKMG DRAM; automotive startups; open source updates

 
April 8th, 2021 by Roberto Frazzoli

Google’s AI scientist Samy Bengio has reportedly resigned over a controversy with the company. Brother of Yoshua Bengio, another world-famous AI scientist, Samy joined Google in 2007 and was part of the TensorFlow team. Prior to that, Samy Bengio co-developed Torch, the ancestor of PyTorch. It will be interesting to see where he will be landing next. Let’s now move to some updates, catching up on some of the news from the last couple of weeks.

Cadence Palladium Z2 and Protium X2 systems

Cadence has introduced the Palladium Z2 Enterprise Emulation and Protium X2 Enterprise Prototyping systems, the successors to the current Palladium Z1 and Protium X1. Based on new emulation processors and Xilinx UltraScale+ VU19P FPGAs, these systems provide – according to Cadence – 2X capacity and 1.5X performance improvements over their predecessors. Both platforms offer a modular compile technology capable of compiling 10 billion gates in under ten hours on the Palladium Z2 system and in under twenty-four hours on the Protium X2 system.
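To put the compile-time claims in perspective, the short Python sketch below converts them into a minimum sustained compile rate. It uses only the figures quoted above (10 billion gates, under ten hours, under twenty-four hours); the helper name `min_compile_rate` is hypothetical, and the results are lower bounds implied by the claims, not measured throughput.

```python
def min_compile_rate(gates: float, max_hours: float) -> float:
    """Lower bound on gates compiled per hour if `gates` finish within `max_hours`."""
    if max_hours <= 0:
        raise ValueError("max_hours must be positive")
    return gates / max_hours


DESIGN_GATES = 10e9  # 10 billion gates, as quoted

# "Under ten hours" on Palladium Z2, "under twenty-four hours" on Protium X2.
palladium_rate = min_compile_rate(DESIGN_GATES, 10)
protium_rate = min_compile_rate(DESIGN_GATES, 24)

if __name__ == "__main__":
    print(f"Palladium Z2: more than {palladium_rate / 1e9:.2f} billion gates/hour")
    print(f"Protium X2:   more than {protium_rate / 1e9:.2f} billion gates/hour")
```

Read this way, the claims imply more than 1 billion gates per hour for the emulator and more than about 0.42 billion gates per hour for the FPGA-based prototyping system.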

Cadence Palladium Z2 and Protium X2. Credit: Business Wire

Siemens’ new Veloce system

Siemens has unveiled its new Veloce hardware-assisted verification system, which combines virtual platform, hardware emulation, and FPGA prototyping technologies. The solution includes four new products: Veloce HYCON (HYbrid CONfigurable) for virtual platform/software-enabled verification; Veloce Strato+, a capacity upgrade to the Veloce Strato hardware emulator that scales up to 15 billion gates; Veloce Primo for enterprise-level FPGA prototyping; and Veloce proFPGA for desktop FPGA prototyping. Customer-built virtual SoC models can begin running real-world firmware and software on Veloce Strato+ for deep visibility down to the lowest level of hardware, then the same design can be moved to Veloce Primo to validate the software/hardware interfaces and execute application-level software while running closer to actual system speeds. Both Veloce Strato+ and Veloce Primo use the same RTL, the same virtual verification environment, the same transactors and models. A key technology in the upgraded Veloce platform is a new, proprietary 2.5D chip which – according to Siemens – enables a 1.5x system capacity increase over the previous Strato system.


Special report: EDA requirements in the design of AI accelerator chips

 
April 2nd, 2021 by Roberto Frazzoli

Innovative architectures, high performance targets, competitive market: does this AI cocktail call for specially optimized EDA solutions? We asked Prith Banerjee (Ansys), Paul Cunningham (Cadence), Mike Demler (The Linley Group), Jitu Khare (SimpleMachines), Poly Palamuttam (SimpleMachines), and Anoop Saha (Siemens EDA).

Never before were silicon startups as numerous as they are today, in this era of ‘silicon Renaissance’ driven by an insatiable hunger for neural network acceleration. Startups engaged in the development of AI accelerator chips are raising considerable venture capital funding – and attracting a lot of attention from the media, as technology champions at the forefront of innovation. Not surprisingly, most EDA vendors have updated their marketing messaging to emphasize product offerings specifically tailored to the design needs of these devices, and AI startups seem to enjoy a privileged status among EDA customers in terms of coverage from vendors’ blogs and press releases. It is therefore interesting to try to figure out whether AI accelerator chips really pose special design challenges calling for specially optimized EDA solutions.

AI chips: different or normal?

Apart from some notable exceptions – such as the devices based on analog processing, or the wafer-scale chip from Cerebras – it seems fair to assume that the vast majority of the AI accelerators being developed are digital and have a ‘normal’ die size. Is there anything special in these chips that makes them different from other complex processors from an EDA standpoint?  “The short answer is no,” says Paul Cunningham, Corporate Vice President and General Manager at Cadence. “I don’t think there is anything really fundamental that makes an AI chip different from other kinds of chips. But an AI chip is usually a very big chip and it’s highly replicated. So you have a basic building block, some kind of floating point MAC, and it’s replicated thousands, tens of thousands, hundreds of thousands of times. The nature of the design will stress the scalability of EDA tools to handle high replication. So in this sense, yes, it is important to make sure that our EDA tools have good performance on this style of design, but if there was another type of design which was also highly replicated, it would stress the tools in the same way.”

Paul Cunningham. Credit: Cadence




