Vaishnav Gorur, Sr. Applications Engineer
Prior to joining Real Intent, Vaishnav was a logic design engineer at MIPS Technologies where he was responsible for the microarchitecture and RTL Design of the Load-Store Unit and Graduation Unit of the 15-stage out-of-order asymmetric dual-issue superscalar pipeline in the MIPS32® 74K® fully …
Part One: Clock and Reset Ubiquity – A CDC Perspective
January 17th, 2013 by Vaishnav Gorur, Sr. Applications Engineer
Imagine yourself as a nanoscopic tourist standing in the heart of a bustling metropolis that is today’s SoC. Wide-eyed and eager, you are looking to hit the sights immediately. Apart from the obvious choices – the processor core in downtown SoC and the network and I/O interfaces on the outskirts – you want to swing by the clock tower at the intersection of Oscillator St. and PLL Ave., rub noses with the ADCs and DACs, take pictures with the Flash and hang out late at the power controller while most of the city sleeps. Lots of architectures to marvel at and lots of memories to cherish – SRAM, DRAM, you name it. Alas, you are short on time and are looking for the best way to get to all of these attractions. Chances are that your trip guide tells you to hop on the clock net, or if that doesn’t tickle your fancy, jump on a reset net. The recommendation is a good one. The clock and reset distribution networks are far-reaching and traverse all components of the SoC. They are, in effect, ubiquitous.
As it turns out, getting those clock and reset nets in place is no trivial pursuit. Clock and reset tree synthesis is a highly complex step in the back-end physical design flow. It involves building a well-balanced distribution network that traverses the SoC and feeds all sequential elements while meeting aggressive timing, signal integrity and power dissipation specifications. Prior to being handed off to physical design teams, clock and reset schemes are subject to sweeping verification suites to ensure correct functional behavior.
The existence of multiple clock domains in an SoC adds a new dimension to this already complicated verification process. The area of clock domain crossing (CDC) verification has grown tremendously over the past few years and has cemented its place in mainstream verification flows. This blog series takes a look at clock and reset nets from a CDC verification perspective, features real-life examples of silicon re-spins caused by clock and reset CDC failures, and illustrates how Real Intent’s Meridian CDC can help detect, isolate and resolve these insidious clock and reset-related CDC issues.
SoCs and the Magnitude of the CDC Problem
Over the past decade, with geometries constantly shrinking, the industry has gradually progressed from ASIC design to SoC design. Sub-systems fabricated as individual ASICs in previous generations of products are now being integrated into one system chip. The multi-fold benefits of higher speed and greater efficiency at reduced chip sizes with lower costs are clearly driving this transition towards smaller process nodes.
Today’s SoC integrates a collection of peripherals, memory, graphics, networking and I/O components that originate from a multitude of sources. It could comprise designs from within the company, from other companies or from third-party IP vendors. These independently developed components come together to enable a rich feature set for the SoC. However, accompanying this abundance of features is a significant amount of complexity that needs to be correctly and efficiently handled to render the integration successful. One such source of complexity is that components operate at clock frequency ranges that may be very different from those of their counterparts. The existence of these multiple clock domains and the need for them to exchange information creates a hotbed for CDC bugs to thrive.
CDC bugs typically lurk at the crossroads of bad design implementation, overlooked timing paths and incomplete verification. As shown in Figure 1, if the signal crossing from one asynchronous domain to another arrives too close to the receiving clock edge, the captured value is nondeterministic due to a setup or hold time violation. This condition is known as metastability, and it can result in incorrect values being propagated downstream, causing functional errors. The failure signatures are unpredictable and intermittent, making them very hard to detect and diagnose via simulation or in the lab.
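To make the timing window concrete, here is a minimal sketch in Python. It is not taken from this post's figures or from Meridian CDC; the function names, the integer-picosecond time base and the assumption that receiving clock edges fall at exact multiples of the clock period are all illustrative. It simply checks whether a data transition lands inside the setup/hold window around a receiving clock edge.

```python
def violates_window(data_edge_ps, clk_period_ps, setup_ps, hold_ps):
    """Return True if a data transition lands inside the setup/hold window
    of a receiving clock edge.

    Assumptions (illustrative only): receiving clock edges sit at integer
    multiples of clk_period_ps, and all times are integer picoseconds.
    """
    phase = data_edge_ps % clk_period_ps   # time since the previous clock edge
    to_next_edge = clk_period_ps - phase   # time until the next clock edge
    # Setup violation: data changes too shortly before the next edge.
    # Hold violation: data changes too shortly after the previous edge.
    return to_next_edge < setup_ps or phase < hold_ps


def count_violations(src_period_ps, dst_period_ps, setup_ps, hold_ps, n_edges):
    """Count source-domain transitions (one per source clock period,
    starting at t = 0) that fall inside the receiving flop's window."""
    return sum(
        violates_window(k * src_period_ps, dst_period_ps, setup_ps, hold_ps)
        for k in range(n_edges)
    )


if __name__ == "__main__":
    # Hypothetical numbers: 7 ns source clock, 10 ns receiving clock,
    # 200 ps setup time, 100 ps hold time.
    print(count_violations(7000, 10000, 200, 100, 1000))
```

The sketch only flags where a violation can occur. It does not model whether the receiving flop actually goes metastable or how long it takes to resolve, which is the part that makes these failures intermittent in practice.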
Contemporary designs contain a considerable number of clock domains, creating thousands of crossings. This causes a significant uptick in the verification effort required to sign off on CDC. The subtlety of CDC issues and the sheer volume of crossings often allow CDC bugs to slip through undetected to tapeout. Unfortunately, they frequently result in failures in the field, requiring expensive silicon re-spins that end up costing companies tens of millions of dollars and lost market opportunities.
*** Next time we will look at the Evolution of Clocks and Resets ***