Graham is VP of Marketing at Real Intent. He has over 20 years of experience in the design automation industry. He has founded startups, brought Nassda to an IPO, and previously was Sales and Marketing Director at Internet Business Systems, a web portal company.
SoC Verification: There is a Stampede!
May 14th, 2015 by Graham Bell
Stories of the Wild West from the 1800s often depict the image of a cattle drive: a small team of cowboys delivering thousands of head of cattle to market. The cowboys spend many days crossing open land until they reach their destination, one with stockyards to accept their precious herd and a rail station to deliver it quickly to market. Along the way there are dangers, including losses to predators and mad stampedes of cattle rushing blindly when frightened or disturbed. The primary job of the cowboys is to keep the herd on track and settled as they move to ship-out.
I see immediate parallels between the cowboys of the Wild West and today’s system-on-chip (SoC) design and verification engineers. Cowhands struggle to control and move a big herd. Similarly, today’s design teams grapple with keeping a project on target and converging toward tape-out when SoC gate counts have grown so large that they can stretch, and even overwhelm, a team’s ability to stay on track. How big are these new SoCs?
The Xbox One gaming console, for example, uses 5 billion transistors, which is equivalent to 1.25 billion digital gates. Its AMD-designed SoC produced at TSMC on a 28-nm process combines eight Jaguar CPU cores and Graphics Core Next (GCN)-class integrated graphics. (See Figure 1.)
Another example, pictured on the left, is Nvidia’s GK110 GPU (also made on TSMC’s 28-nm process), which has 7.1 billion transistors. This translates to nearly 2 billion digital gates. These are not just big chips but giant chips!
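The transistor-to-gate figures above follow the common rule of thumb of roughly four transistors per two-input NAND "gate equivalent" (the constant is a standard convention, not something stated in the article). A quick sanity check of the arithmetic:

```python
# Rough gate-equivalent arithmetic. The 4-transistors-per-gate figure is the
# conventional 2-input NAND gate-equivalent metric, assumed here for illustration.
TRANSISTORS_PER_GATE = 4

def gate_equivalents(transistors: float) -> float:
    """Convert a transistor count to an approximate digital gate count."""
    return transistors / TRANSISTORS_PER_GATE

xbox_one_soc = gate_equivalents(5e9)   # 1.25 billion gates, as cited above
gk110_gpu = gate_equivalents(7.1e9)    # about 1.78 billion gates, "nearly 2 billion"
```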
With each smaller semiconductor node that foundries provide, more gates can be squeezed into the same die size. In parallel, many different kinds of design blocks and intellectual property (IP), usually created by third parties, are employed to accelerate implementation of the design objectives. The interaction of the various blocks across different power and timing conditions adds a new kind of complexity to the design. The result is a “herd” of interfaces with thousands of distinct crossings that must be checked and verified to ensure the design does not run off into a fatal operating condition.
It would be great to have the luxury of several hundred design and verification engineers to verify all possible failures in these giant SoCs, but that is not usually the case. Typically a small team relies on design automation software to manage the complexity of the verification challenge.
For each interface in the SoC, signals cross asynchronously between the various IPs and must be registered correctly to ensure the integrity of the digital signal path and eliminate metastability errors. For bus-level signals, circuitry such as a FIFO manages the data transfer, and verification must ensure there is no data overflow or underflow that could compromise the design. This approach requires a full-chip clock domain crossing (CDC) analysis.
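The metastability risk mentioned above is usually quantified with the standard mean-time-between-failures model for a flip-flop synchronizer, MTBF = exp(t_settle/τ) / (T_w · f_clk · f_data). The sketch below uses that textbook formula with illustrative, made-up parameter values (the article cites no specific numbers); it shows why adding a second synchronizer flop, which grants roughly one extra clock period of settling time, improves MTBF so dramatically:

```python
import math

def synchronizer_mtbf(t_settle, tau, t_window, f_clk, f_data):
    """Mean time between metastability failures, in seconds, using the
    standard model: MTBF = exp(t_settle / tau) / (t_window * f_clk * f_data).
    t_settle: time available for a metastable state to resolve
    tau:      metastability resolution time constant of the flop
    t_window: metastability capture window of the flop
    f_clk:    receiving-domain clock frequency
    f_data:   toggle rate of the asynchronous data input
    """
    return math.exp(t_settle / tau) / (t_window * f_clk * f_data)

# Illustrative (assumed, not vendor-characterized) values:
tau = 10e-12       # 10 ps resolution constant
t_window = 20e-12  # 20 ps capture window
f_clk = 1e9        # 1 GHz receiving clock
f_data = 100e6     # 100 MHz data toggle rate

one_flop = synchronizer_mtbf(0.5e-9, tau, t_window, f_clk, f_data)
two_flop = synchronizer_mtbf(1.5e-9, tau, t_window, f_clk, f_data)  # ~one extra period to settle
```

Because settling time enters the formula exponentially, the two-flop case yields an MTBF many orders of magnitude larger, which is why double-flop synchronizers are the baseline structure CDC tools look for on control-signal crossings.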
Design teams need three elements to achieve overnight CDC analysis runs for functional sign-off – precision, throughput and ease of use. (See Figure 2.)
Precise analysis means the software must accurately capture all possible interfaces in the design, including buses; provide reset analysis, including glitches in both asynchronous and synchronous domains; and correctly handle crossings that may be blocked by the environment definition. Once the analysis is done, it is essential to be able to verify the interfaces automatically, using formal technologies, so that all possible failure conditions can be exhaustively covered.
Likewise, throughput has three important considerations: runtime, capacity and methodology. Design analysis must complete in overnight runs to make the progress necessary to stay on schedule. In terms of capacity, a terabyte of computer memory is no longer needed to verify a 500-million-gate design; teams can use more standard hardware instead. For giga-scale designs, a hierarchical methodology is needed that leverages block-level CDC sign-off for chip-level CDC verification. This methodology is effective for sign-off only if the SoC verification makes no approximations or abstractions; only then can it truly ensure no signal-crossing errors are missed.
Ease of use is the third major aspect of CDC analysis for functional sign-off. The software setup must be easy and automated to ensure the quality of results. The various kinds of analysis, including formal analysis, must generate results without the user writing any tests. Finally, and perhaps most important, the debugging of analysis results must be hierarchical and fully customizable. This kind of flexibility is typically available only from a full database of analysis results. Graphical and command-line interfaces must be able to extract the necessary reports in a variety of formats, with the data organized as required for any specific verification flow. Whether using HTML docs or custom spreadsheets, the design and verification team should be able to “rope in” any interface issue.
SoC verification poses many challenges through the sheer size of designs and the varied mix of design IP, each block operating with its own clocking scheme. Successful SoC design teams will meet the challenge of clock domain crossing verification with a solution that provides the precision, throughput and ease of use they need. This approach will avoid a stampede of errors and the late debugging that would delay the ship-out of their designs.
This blog article was originally published on EETimes SoC Designlines.