Archive for the ‘Uncategorized’ Category
Thursday, February 28th, 2013
We just finished a BUSY week of activity at the Design and Verification Conference. The exhibits were open for 3 hours on Tues. and Wed. in the late afternoon. The floor was buzzing when the technical program was done for the day and the beverage bar helped to fuel everyone’s spirits. I think floor traffic may have been lighter than in earlier years, but the floor layout was definitely more conducive to moving around.
Vaishnav Gorur, Sr. Field Applications Engineer, at the Real Intent DVCon 2013 booth
Real Intent sponsored a panel, “Where Does Design End and Verification Begin?”, which had over 140 attendees listen to experts from ARM, Mentor Graphics, Intel, GarySmithEDA, and Real Intent. The discussion was lively and the moderator, Brian Hunter from Cavium, threw at least one gibe. I knew it was a very good panel when attendees were echoing the discussion at another panel later that day. I had my video camera recording the back-and-forth and will be posting clips from the panel in the weeks ahead on this blog.
Thursday, February 21st, 2013
Sequential clock gating is a relatively new, more complicated technique that involves the identification of enable signals based on analysis that spans multiple clock cycles. By examining the design across sequential boundaries, advanced power optimization tools can identify data dependencies, observable don’t-care conditions and unused states. Based on this information, they then formulate enable conditions to shut off the clock to groups of flip-flops in the design. Sequential clock gating typically provides higher power savings than its combinational counterpart, as it has the potential to turn off more registers for a larger number of clock cycles.
Figure 5. XOR self-gating
A practical example of a sequential clock gating scheme is to turn off subsequent banks of pipeline registers based on the propagated value of the enable signal in the current pipeline stage. This concept can be boiled down further and restricted to a single flop.
Note that it is possible to gate the clock to a flop based on the output of the flop in the previous cycle and the incoming data value. A simple XOR of the output and incoming input signal can be used as the enable signal for the clock gater as depicted in Figure 5. This technique is employed in power optimization tools.
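To make the XOR self-gating idea concrete, here is a minimal behavioural sketch in Python rather than RTL. It is only a cycle-level model: the function names and the stimulus are made up for illustration, and real clock-gating cells, latches and timing are ignored. It shows that the self-gated flop produces the same output as an ungated one while receiving far fewer clock pulses.

```python
def simulate_self_gated_flop(data_in, initial_q=0):
    """Cycle-level model of one flop whose clock enable is (D XOR Q)."""
    q = initial_q
    q_trace = []
    clock_pulses = 0
    for d in data_in:
        enable = d ^ q        # high only when the incoming value differs from Q
        if enable:            # the gated clock reaches the flop only in this case
            q = d
            clock_pulses += 1
        q_trace.append(q)
    return q_trace, clock_pulses


def simulate_ungated_flop(data_in, initial_q=0):
    """Reference model: the flop is clocked on every cycle."""
    q = initial_q
    q_trace = []
    for d in data_in:
        q = d
        q_trace.append(q)
    return q_trace, len(data_in)


# Hypothetical stimulus: the data input changes only twice in eight cycles.
stimulus = [0, 0, 1, 1, 1, 0, 0, 0]
gated_q, gated_pulses = simulate_self_gated_flop(stimulus)
plain_q, plain_pulses = simulate_ungated_flop(stimulus)

print("same output:", gated_q == plain_q)
print("clock pulses, gated vs. ungated:", gated_pulses, plain_pulses)
```

In this toy run the gated flop is clocked only on the cycles where its value actually changes. In a real design the saving depends on how often the data toggles and on the overhead of the XOR and gating cell.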
Thursday, February 14th, 2013
I must admit that I am having a blast seeing all the field reports that mention prospective customers’ impressions of Real Intent’s Ascent Lint product: three in just the past week. The typical comment is that it is “easy to use and intuitive.” As we built Ascent Lint from the ground up three years ago, usability was kept in mind starting from day one.
When deciding on both the command and the graphical user interface, I take the perspective of a user who has never used our tool before and doesn’t want to learn a new tool environment. I pretend I want to have everything at my fingertips, just like royalty. On the Ascent Lint engineering team, we think ahead of all the “what-if” debug scenarios so the end-user can focus on their design and fix actual problems.
The user interface is just one factor contributing to Ascent Lint being “easy to use”. Its next-generation engine delivers the highest performance and the highest capacity available in the market. No other tool has been able to beat the speed and capacity of Ascent Lint, which is up to 50X faster than other products. It delivers an almost unbelievable runtime of less than an hour for a 450-million-gate design run flat at the chip level.
What does that mean in terms of usability? Well, it means the user does NOT need to learn a new flow for next-generation designs with hundreds of millions of gates. They are free from writing complex multi-iteration hierarchical scripts to get around the limitations in tools that cannot support these designs. It avoids the need to run multiple lint jobs at different levels of the hierarchy and eliminates the added noise that is always present when merging multiple analysis reports into a single one. The raw performance and capacity advantages of Ascent Lint lead to significant process simplification and reduction in cost, time and resources.
Wednesday, February 13th, 2013
Combinational clock gating is a relatively straightforward technique of disabling the clock to registers when the register output does not change. This involves identifying combinational logic conditions that cause a register to hold its previous value and using them as an enable signal that gates the clock pin instead.
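As a rough illustration of why this transformation preserves behaviour, the Python sketch below models a register with a hold mux and the same register with its clock gated by the enable, then compares them over every short input sequence. The function names are hypothetical and the model is cycle-level only. It says nothing about glitches or clock domain crossings.

```python
from itertools import product


def mux_hold_register(stimulus, q=0):
    """Original form: clocked every cycle, a feedback mux holds Q when enable is 0."""
    trace = []
    for enable, d in stimulus:
        q = d if enable else q
        trace.append(q)
    return trace


def clock_gated_register(stimulus, q=0):
    """Transformed form: the hold condition gates the clock instead of driving a mux."""
    trace = []
    for enable, d in stimulus:
        gated_clock = enable      # the clock edge reaches the flop only when enabled
        if gated_clock:
            q = d
        trace.append(q)
    return trace


def equivalent_for_all_sequences(cycles):
    """Brute-force comparison over every (enable, d) sequence of the given length."""
    per_cycle = list(product((0, 1), repeat=2))
    for stimulus in product(per_cycle, repeat=cycles):
        for q0 in (0, 1):
            if mux_hold_register(stimulus, q0) != clock_gated_register(stimulus, q0):
                return False
    return True


all_match = equivalent_for_all_sequences(4)
print("equivalent over all 4-cycle sequences:", all_match)
```

Exhaustive enumeration only works for a toy like this one. It is not how a production equivalence checker operates, but it captures the property being checked.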
Figure 3. Combinational clock gating
Opportunities to insert combinational clock gating can be identified by power-aware RTL synthesis tools based on the analysis of the combinational cone of logic between registers. Figure 3 illustrates a transformation involved in combinational clock gating. Once the clock gating insertion is complete, a logical equivalence checking tool is employed to ensure that the resulting design with clock gates inserted is indeed functionally equivalent to the original design. The synthesis tool, however, is not clock domain crossing aware and might perform optimizations that violate CDC principles at the boundary interfaces between clock domains. Consider the following situation where the two clocks clkA and clkB are asynchronous.
Thursday, February 7th, 2013
Quick, what’s two plus two? The answer is four, of course, assuming you have enough bits to compute the entire result. When writing code in Verilog, this is not always a safe assumption.
In Verilog, each expression is determined to have a specific bit length. For the logic expressions (e.g. and, or, xor) and arithmetic expressions (e.g. add, subtract, multiply), if the expression is on the right-hand side of an assignment, then the length is determined by the size of the result on the left-hand side (or by the largest operand, if that is wider). If the expression isn’t in an assignment, or appears in a concatenate or index range, then its length is determined by its largest argument. Literal integers, specified without a length, are at least 32 bits long.
For example, in the assignment:
qq[2:0] = aa[1:0] + bb[1:0]
The assignment is to a three-bit value, so the add operation of aa and bb is determined to be three bits wide. But in the statement:
$display("This value is %x", mem[ aa[1:0] + bb[1:0] ]);
the add operation here is two bits long, as its largest operand is two bits. If aa and bb are both 2'd2, then the carry bit is discarded and the result of the add here is 2'd0. (The details of how expression lengths are determined are spelled out in the Verilog Language Reference Manual, IEEE-1364.)
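The two cases above can be mimicked with a small Python model. This is only a sketch of the sizing rule for unsigned addition. The function verilog_add and its parameters are invented for illustration and do not cover signed operands or the other operators.

```python
def verilog_add(a, b, a_width, b_width, lhs_width=None):
    """Model the bit length of an unsigned Verilog add.

    lhs_width=None models a self-determined context (e.g. an index expression);
    otherwise the add is sized by the assignment target as well as the operands.
    """
    operand_width = max(a_width, b_width)
    if lhs_width is None:
        width = operand_width                  # largest operand decides
    else:
        width = max(lhs_width, operand_width)  # assignment context widens the add

    a &= (1 << a_width) - 1                    # clip operands to their declared widths
    b &= (1 << b_width) - 1
    result = (a + b) & ((1 << width) - 1)      # bits above the expression width are lost

    if lhs_width is not None:
        result &= (1 << lhs_width) - 1         # truncate to the target on assignment
    return result


# qq[2:0] = aa[1:0] + bb[1:0] with aa = bb = 2: the carry bit is kept.
assigned = verilog_add(2, 2, a_width=2, b_width=2, lhs_width=3)

# mem[ aa[1:0] + bb[1:0] ] with aa = bb = 2: the add is two bits wide.
indexed = verilog_add(2, 2, a_width=2, b_width=2)

print("assigned to 3 bits:", assigned)
print("used as an index:  ", indexed)
```

With both operands equal to 2, the assignment case yields 4 while the index case wraps around to 0, matching the behaviour described above.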
Thursday, January 31st, 2013
Real Intent’s Cool Giveaway at DVCon
Besides our usual exhibit at the Design and Verification Conference in San Jose at the end of February, Real Intent has organized a Panel and a ½ day Tutorial, which I think highlight some of the changes happening in our industry that may have been overlooked.
The Panel discusses the interesting topic “Where Does Design End and Verification Begin?” The abstract states that design and verification are “joined at the hip” as the initial spec for simulation and architectural exploration leads to an RTL model and finally a gate-level implementation. It claims that the verification flow applies assertions, testbenches, timing constraints and automation methods as the design devolves. Are they completely entwined? From what I have seen, design teams typically see a boundary between those that write the RTL code and those that verify it. Is there a clean hand-off between D and V? And what is the best practice for the industry? I look forward to hearing what the moderator, Brian Hunter of Cavium Networks and the panelists John Goodenough (ARM), Oren Katzir (Intel), Harry Foster (Mentor), Pranav Ashar (Real Intent) and Gary Smith (GSEDA) will say about these questions on the morning of Wed. Feb. 27.
Secondly, the ½ day Tutorial on “Pre-Simulation Verification for RTL Sign-off” presents the toolset that surrounds the traditional dynamic simulation and timing analysis used by engineers. The integration of heterogeneous IP and design units into an SOC requires confirmation of protocols, power budgets, testability and the correct operation of multiple interfaces and clock domain crossings (CDC). Simulation can theoretically be used to fully test an SOC but the cost of complete RTL testing is beyond what design teams can afford. To reduce cost and the risk of missing critical tests, abstract modeling and pre-simulation static analysis of RTL have become imperative in SOC design flows.
Thursday, January 24th, 2013
The Evolution of Clocks and Resets
Traditional clock and reset mechanisms typically were based on a master clock and reset distributed throughout the chip. Today, the die size is large enough that it is impractical to distribute the same fast clock to all parts. In addition, power management dictates that there be multiple VDD and clock domains on the chip that can be turned on and off independently. Clock frequencies in communicating domains (asynchronous or not) can differ by an order of magnitude and clock frequencies are allowed to vary dynamically based on throughput requirements or for power optimization. The proliferation of gated clocks for power optimization has introduced new tools in the design flow. The process of adding clock gates traditionally done by logic designers is now being automated to ensure that no power savings are left on the table. The premise for usage of gated clocks is that there is no modification of the original functionality. Logical equivalence checking tools need to be employed in the design flow to ensure this is indeed the case.
The large amount of clock gating, the variety of clock gating techniques, the nontrivial control circuitry involved and the likelihood that most of it is automatically inserted by a synthesis tool or power optimization tool further complicate verification. The implementation of power-up reset is also more complex today as it is designed to optimize power and physical layout. It is imperative that clock and reset schemes be comprehensively verified prior to analysis of the rest of the design. Empirical evidence suggests that a lot of issues initially diagnosed as being control or datapath-related are eventually traced to improper clock and reset behavior.
Exacerbating the verification problem is the fact that synthesis and power optimization tools are not glitch-aware and that there exists a distinct possibility that glitch-susceptible logic is inserted during optimization phases. This suggests that verification tools for clock and reset analysis be operable at the RT level as well as at the gate level. The following section details some real-life examples of the issues mentioned.
Thursday, January 17th, 2013
Imagine yourself as a nanoscopic sized tourist standing in the heart of a bustling metropolis that is today’s SoC. Wide-eyed and eager, you are looking to hit the sights immediately. Apart from the obvious choices – the processor core in downtown SoC and the network and I/O interfaces on the outskirts, you want to swing by the clock tower at the intersection of Oscillator St. and PLL Ave., rub noses with the ADCs and DACs, take pictures with the Flash and hang out late at the power controller while most of the city sleeps. Lots of architectures to marvel at and lots of memories to cherish – SRAM, DRAM, you name it. Alas, you are short on time and are looking for the best way to get to all of these attractions. Chances are that your trip guide tells you to hop on the clock net, or if that doesn’t tickle your fancy, jump on a reset net. The recommendation is a good one. The clock and reset distribution networks are far-reaching and traverse all components of the SoC. They are, in effect, ubiquitous.
As it turns out, getting those clock and reset nets in place is no trivial pursuit. Clock and reset tree synthesis is a highly complex step in the back-end physical design flow. It involves building a well-balanced distribution network that traverses across the SoC feeding all sequential elements while meeting aggressive timing parameters, signal integrity and power dissipation specifications. Prior to being handed off to physical design teams, clock and reset schemes are subject to sweeping verification suites to ensure correct functional behavior.
The existence of multiple clock domains in an SoC adds a new dimension to this already complicated verification process. The area of clock domain crossing (CDC) verification has grown tremendously over the past few years and has cemented its place in mainstream verification flows. This blog series takes a look at clock and reset nets from a CDC verification perspective, features real-life examples of silicon re-spins caused by clock and reset CDC failures, and illustrates how Real Intent’s Meridian CDC can help detect, isolate and resolve these insidious clock and reset-related CDC issues.
Thursday, January 10th, 2013
Some lint rules point out things that are probably wrong, something missing, or code that may not do what you intended. Others enforce naming and coding rules to make code clearer, more consistent, and easier to maintain. MIN_ID_LEN is in the latter group. It checks all names in your design to make sure they are at least as long as the minimum length you specify, and reports when it sees a name that is too short.
How often have we written code like the following:
for (i = 0; i < N; i = i+1) …
This ubiquitous for statement works just as well in Verilog, SystemVerilog, C, and C++. But consider what happens when the block that follows uses the variable ‘i’ several times, and then sometime later someone needs to search through it to find out what’s happening or to debug some problem.
When I write code for either hardware design or programming, I’ve developed the habit of using at least two characters for identifier names, as in:
for(ii =0; ii < NN; ii++) …
Thursday, January 3rd, 2013
Happy New Year! In this first blog post of 2013, I am relaying some thoughts and opinions that have come out over the last month from Real Intent.
Prakash Narain, President and CEO at Real Intent, was asked in the Dec. 2012 issue of System Level Design (SLD) what he saw as the big issues and developments in design over the next 12 to 24 months. Here is his answer:
“On-chip complexity will see a major jump in the next silicon node. The 20nm node may have 14nm transistors, or semi companies may skip 20nm and go straight to 14nm. Either way, there will be significantly more transistors in the next node than in the typical transition. Impact will be felt on EDA tools across the board. The largest impact will be on verification tools that don’t have the luxury of a sliding scale of Quality-of-Result (QoR). Today, we are seeing 450M gate SoC designs for Clock Domain Crossing (CDC) verification. With the new nodes and the demand for ever-greater feature sets and performance by consumers, 1G-gate designs will be the top end in 12 to 24 months. The challenge will be to scale verification tools to handle these designs and give signoff-accurate results in hours and not days or weeks.”
Dr. Pranav Ashar, CTO at Real Intent, gave his thoughts to Ed Sperling of SLD in his Adventures In Verification story on how both hierarchical and flat methodologies play a role in SoC verification. He believes a marriage of both approaches is best. Here is a short excerpt from the story concerning this issue: