Dr. Pranav Ashar, CTO
Dr. Ashar brings two decades of EDA expertise to Real Intent. He previously worked at NEC Labs (Princeton, NJ) developing formal verification technologies for VLSI design. He authored about 70 papers and co-authored a book titled "Sequential Logic Synthesis." He has 35 patents granted and pending, …
Clock Domain Verification Challenges: How Real Intent is Solving Them
July 19th, 2010 by Dr. Pranav Ashar, CTO
With chip-design risk at worrying levels, a verification methodology based on linting and simulation alone does not cut it. Real Intent has demonstrated the benefit of identifying specific sources of verification complexity and deploying automatic, customized technologies to tackle them surgically. At first glance, automatic and customized do not go together: automation maximizes productivity in setup, analysis and debug, whereas customization ensures comprehensiveness. Reconciling the two is the challenge for clock-domain verification, as it is for the plethora of other failure modes in modern chips. Clock-domain verification is a case in point because its complexity has grown tremendously:
Signal crossings between asynchronous clock domains: The number of asynchronous domains approaches 100 in high-end SOCs optimized for performance or power. The chip is too large to distribute the same clock to all parts, and an SOC is increasingly a collection of sub-components, each with its own clock. Given the large number of domains and crossings, the myriad protocols for implementing the crossings, and the correspondingly large number of failure modes, writing templates to cover all scenarios is very expensive. Template-based linting on such chips, with millions of gates, is very slow and can take days. Additionally, the report from a template-based analysis is so voluminous that the team struggles to analyze it manually, and real failures are overlooked.
Widely disparate and dynamic clock frequencies: Analyzing crossings for data integrity and data loss under all scenarios is non-trivial and beyond what linting alone can do.
Proliferation of gated clocks: Power-management and mode-specific gated clocks are now common, introducing a multi-part verification problem. (1) The clock setup must be correct for verification to be meaningful. Detailed setup analysis highlights errors in clock distribution or in the environment spec. (2) Designs with gated clocks must be functionally verified. (3) The variety of gated-clock implementations creates a variety of glitching possibilities. Clock glitches are very hard to diagnose, so you want to know about the possibility as early as possible. Given the variety of gated-clock types and glitching modes, a template-based approach is a recipe for productivity loss and slow analysis.
Reset distribution: Power-up reset is now much more complex because it is optimized for power and routing. Full verification of the reset setup prior to subsequent analysis is essential.
Timing optimization: Optimizations like retiming may violate design principles, creating glitch potential at the gate level even if there was none in RTL. Glitch analysis must be an integral part of verification, and the tool must operate on RTL as well as gates. Template methods make this harder, since multiple templates may be required to support RTL, gate-level and mixed-language designs.
Clock distribution: Issues that were previously second-order, such as clock jitter in data/control transfers, have more impact in deep-submicron (DSM) processes. Even synchronous crossings must now be designed carefully and verified comprehensively.
Full-chip analysis: Speed, scalability, precision and redundancy control become key considerations in full-chip analysis of designs with many hierarchy levels and 100 million gates.
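To make the crossing problem in the list above concrete, here is a minimal Python sketch of a purely structural check on a toy netlist. Everything in it is an assumption made for illustration: the FLOPS dictionary, the signal and clock names, and the single two-flop synchronizer shape it recognizes. It is not how any particular tool works.

```python
# Hypothetical toy netlist: each flop lists its clock and the flops driving its D input.
FLOPS = {
    "tx_data":  {"clk": "clk_a", "drivers": []},
    "tx_valid": {"clk": "clk_a", "drivers": []},
    "sync_1":   {"clk": "clk_b", "drivers": ["tx_valid"]},
    "sync_2":   {"clk": "clk_b", "drivers": ["sync_1"]},
    "rx_data":  {"clk": "clk_b", "drivers": ["tx_data"]},
}

def find_crossings(flops):
    """Return (source, dest) pairs whose source and destination flops use different clocks."""
    return [
        (src, dst)
        for dst, info in flops.items()
        for src in info["drivers"]
        if flops[src]["clk"] != info["clk"]
    ]

def fanout(flops, name):
    """Return the flops that list `name` as a driver."""
    return [n for n, info in flops.items() if name in info["drivers"]]

def looks_synchronized(flops, dst):
    """Recognize one shape only: dst has a single driver and feeds exactly one
    same-clock flop that has dst as its sole driver (a two-flop synchronizer)."""
    if len(flops[dst]["drivers"]) != 1:
        return False
    loads = fanout(flops, dst)
    return (
        len(loads) == 1
        and flops[loads[0]]["clk"] == flops[dst]["clk"]
        and flops[loads[0]]["drivers"] == [dst]
    )

def unsynchronized_crossings(flops):
    """Crossings whose destination does not match the single recognized shape."""
    return [(s, d) for s, d in find_crossings(flops) if not looks_synchronized(flops, d)]

print(unsynchronized_crossings(FLOPS))
```

On this toy netlist the check flags the tx_data to rx_data crossing. Whether that crossing is a real bug depends on intent: if rx_data is only sampled after the synchronized valid signal arrives, the crossing may be safe. A single hard-coded shape cannot tell the difference, which is the scaling problem with templates described above.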
Real chip respins are revealing: (1) An asynchronous reset control signal crossed clock domains without being synchronously de-asserted, causing a glitch in the control lines to an FSM. (2) An improper FIFO protocol controlling an asynchronous data crossing caused read-before-write and functional failure. (3) Reconvergence of synchronized control signals that were not Gray-coded caused cycle jitter and an incorrect FSM transition. (4) A glitch in a logic cone on an asynchronous crossing path was latched into the destination domain, corrupting the captured data. (5) Gating logic inserted by power-management tools resulted in a clock glitch.
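Failure (3) can be illustrated with a small Python sketch. It assumes a simplified model in which each changing bit of a multi-bit value independently arrives either before or after the destination clock edge. The function names are hypothetical and the model ignores metastability resolution time.

```python
from itertools import product

def to_gray(n):
    """Convert a binary count to its reflected Gray-code equivalent."""
    return n ^ (n >> 1)

def possible_captures(old, new, width):
    """Return every value the destination might capture when each changing
    bit is independently seen as either its old or its new value."""
    changing = [b for b in range(width) if ((old ^ new) >> b) & 1]
    seen = set()
    for choice in product([0, 1], repeat=len(changing)):
        value = old
        for bit, take_new in zip(changing, choice):
            if take_new:
                value ^= 1 << bit  # this bit has already switched
        seen.add(value)
    return seen

# Binary 3 -> 4 flips all three bits, so any 3-bit value may be captured.
print(sorted(possible_captures(3, 4, 3)))
# Gray-coded 3 -> 4 flips one bit, so only the old or the new code is possible.
print(sorted(possible_captures(to_gray(3), to_gray(4), 3)))
```

In this model the binary transition can be captured as any of the eight 3-bit values, while the Gray-coded transition yields only the old or the new code word. This is why reconverging control bits that were synchronized separately can drive an FSM into a transition the designer never intended.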
CDC verification is not solved adequately by simulation or linting. It has become a true showstopper, and an effective solution is a must-have. Real Intent's approach starts from a first-principles understanding of the failure modes and develops symbiotic structural and formal methods to cover them comprehensively and precisely. Structural and formal methods combine to check the clock and reset setup, metastability errors, glitching, data integrity and loss, and signal de-correlation. This approach allows us to auto-infer designer intent and the appropriate checks for each crossing or clock/reset distribution. As a result, our structural analysis runs 10x faster and does not require the designer to develop templates. Formal methods analyze for failures under all scenarios efficiently and comprehensively, without a laborious enumeration of scenarios. For example, our free-running-clock feature checks for data loss for all frequency ratios. We complete the solution with an automatic link to simulation that models metastability and adds checks in the testbench. These solutions are offered in Real Intent's Meridian product family.
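To show why data loss depends on the frequency ratio, here is a toy Python model of a pulse crossing from one clock domain to another. It is a sketch under stated assumptions (ideal clock edges, no setup/hold margin, hypothetical function names) and is not the free-running-clock feature itself. A pulse held for hold_cycles source cycles is captured only if a destination edge lands inside it.

```python
from fractions import Fraction
import math

def pulse_is_captured(t_src, t_dst, hold_cycles, phase):
    """True if a destination edge at phase + k*t_dst falls inside the pulse,
    modeled as the half-open interval [0, hold_cycles * t_src)."""
    width = hold_cycles * Fraction(t_src)
    k = math.ceil(-Fraction(phase) / Fraction(t_dst))  # first edge at or after time 0
    first_edge = Fraction(phase) + k * Fraction(t_dst)
    return first_edge < width

def min_hold_cycles(t_src, t_dst):
    """Smallest hold, in source cycles, that guarantees capture for every phase
    in this idealized model: the pulse must span a full destination period."""
    return math.ceil(Fraction(t_dst) / Fraction(t_src))

def always_captured(t_src, t_dst, hold_cycles, steps=1000):
    """Sampled sweep over destination-clock phases; a spot check, not a proof."""
    return all(
        pulse_is_captured(t_src, t_dst, hold_cycles, Fraction(i, steps) * Fraction(t_dst))
        for i in range(steps)
    )

t_src, t_dst = Fraction(1), Fraction(5, 2)   # destination clock is 2.5x slower
print(min_hold_cycles(t_src, t_dst))
print(always_captured(t_src, t_dst, 3), always_captured(t_src, t_dst, 2))
```

With a destination clock 2.5 times slower than the source, the model needs a hold of three source cycles, and a two-cycle pulse is missed for some phases. The sweep samples only a finite set of phases for one ratio. Covering every ratio and phase without enumerating them is the kind of question formal analysis is suited to.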