We have seen that FPGAs are ideal on the speed front, able to execute in real time or close to it. In the sixteen years I have been looking at emulators, the speed of the devices and of the systems has grown tremendously, but the emulators are stuck at about 1 MHz. People have even come out with novel architectures in the last couple of years. Eve, for example, you could argue is a different architecture for simulators. It modestly improves the speed on real designs, maybe 2 MHz rather than 1 MHz. The bottom line is that emulators typically carry very high price tags and offer very limited speed with which you can exercise or simulate the designs.
With FPGA based prototypes, on the other hand, the cost per gate of the FPGAs themselves, whether single or multiple, has been declining very sharply; Xilinx has published some data on this. Whether for single or multiple FPGAs, we work with a set of companies that we call Partners in Prototyping (PIP). There are about 10 of these board companies: HARDI Electronics in Sweden, GiDEL, ProDesign, Nallatech, SK-Electronic and the Dini Group. These companies have essentially created a very nice market for themselves, about a $50 million market today.

It is interesting to look at where the emulation market is. According to Dataquest it is now down to about a $60 million per year market. That market has essentially shrunk over the last four years from about $150 million per year down to $60 million. The price tag is $1 million or more for a Palladium or Mentor emulator, so you are effectively looking at a market where fifty or maybe fewer units go out each year. It addresses very deep pocketed firms with long term, longer time frame products. The guys developing on a 15 month or 12 month ASIC turnaround cycle, doing .18 micron or .13 micron, even now finally moving into 90 nm, are essentially unserved by that market. They don't have the money, and frankly speaking they don't have the time. FPGA based prototypes have grown quite naturally to address that kind of market, and we are pleased with the growth of our partners. The speed and cost aspect of FPGAs, we think, is if anything going to become a better proposition now that boards with Virtex-5 are starting to come out. And of course Stratix III is coming out, with cost per function in tremendous decline.
If we want to make FPGAs effective replacements for emulators, we really need to bring visibility into them. We need some way of capturing full signal visibility across the design as well as full state visibility: whether it is DSP block content, memory blocks in the design or all of the signal values, that is what we have set out to develop.
Coming to the technology, we are doing two things as part of TotalRecall. First, we have developed technology that provides full visibility inside either a single FPGA or multiple-FPGA boards. Second, we really feel that assertions should be more heavily exercised within the RTL verification process, so we have developed capabilities for bringing those assertions into hardware and exercising them at FPGA speeds. We see a lot of advantages to that, especially as tied into the overall TotalRecall methodology. To highlight what TotalRecall is all about: we think it addresses the key limits we talked about and adds to existing methods. At a high level, the way it works is that we use the FPGA or multiple FPGAs. We do not limit this to ASICs; it is for ASIC verification, FPGA verification, any sort of digital IC: ASIC, FPGA, SoC and so forth. Once an event has happened, once a trigger is reached, once an assertion has fired, the method is capable of going backwards in time (that's the clever bit). The user can essentially dial in how far back, and then recreate in detail the sequence of events leading up to that known bug. Once the assertion triggers or the bug is detected, the method goes back in time and recreates the sequence of events that led up to it. We automatically bring that over into the simulator as a testbench, with all the initialization information, which the user can either execute directly within the simulator or use in a hardware-in-the-loop approach, going between the simulator and the hardware board for things like single-stepping. Once the bug is detected, analyzed and a fix proposed, the fix can be verified back in the live running FPGA hardware using exactly the sequence of events that led up to the bug in the first place. That matters especially for bugs that are sporadic or arise from very infrequent interactions of hardware and software.
Those are the ones this approach nails down. There is no guesswork of having to create a testbench. You are actually using the real stimulus and live events that led up to that bug, and you can use that same information to verify that the fix is correct.
To go into more detail: the RTL source for the design (ASIC or FPGA) is brought into the FPGA and into the HDL simulator. Of course the user would also specify, against that RTL, their trigger conditions, watchpoints, breakpoints and so forth. As the design is running, a trigger event will occur, whether an assertion or another form of trigger. When the trigger event happens we reach backwards and create a testbench that contains all the initialization values in the design: all the states, the block content and so forth. We then trace through with the real-time stimulus that led up to that event. Brought into the simulator, this lets that information be replayed over and over and analyzed to the engineer's heart's content.
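To make the replay step concrete, here is a minimal sketch in Python (not the product's actual output format, which is not described in the source): a hypothetical `ReplayTestbench` holds the captured initial state and the buffered stimulus, and re-drives them through a cycle-step model of the design, keeping every intermediate state for single-stepping.

```python
# Hypothetical sketch of the generated replay testbench: apply the captured
# initial state, then re-drive the buffered stimulus cycle by cycle.
# Names (ReplayTestbench, dut_step) are illustrative, not from the tool.

class ReplayTestbench:
    def __init__(self, init_state, stimulus_trace):
        self.init_state = dict(init_state)          # register/memory snapshot
        self.stimulus_trace = list(stimulus_trace)  # per-cycle input vectors

    def run(self, step_fn):
        """Replay the trace through `step_fn(state, inputs) -> new_state`."""
        state = dict(self.init_state)
        history = [state]
        for inputs in self.stimulus_trace:
            state = step_fn(state, inputs)
            history.append(state)
        return history  # every intermediate state, for single-stepping

# Toy stand-in for the design under test: a simple accumulator register.
def dut_step(state, inputs):
    return {"acc": state["acc"] + inputs["data"]}

tb = ReplayTestbench({"acc": 5}, [{"data": 1}, {"data": 2}, {"data": 3}])
trace = tb.run(dut_step)
print(trace[-1]["acc"])  # 11
```

The point of returning the whole history rather than just the final state is that the engineer can step backwards and forwards through the run-up to the bug, which mirrors the single-stepping workflow described above.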
The short form is that this is a fast-forward button for the simulator: it brings you to where the bug occurs, the interesting bit, and provides all the information the simulator needs to debug that event.
Editor: My sense is that the system stores and delays the application of the stimulus to the replicated logic. When an event occurs, the replicated logic and memory buffer are paused, and their contents are extracted and saved. There is nothing running truly backwards.
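The editor's hypothesis can be sketched in a few lines of Python. This is only a model of that hypothesis, not the vendor's documented implementation: stimulus passes through a ring buffer of depth N before reaching a replica of the logic, so when a trigger fires, the replica's state is N cycles in the past and the buffer holds exactly the stimulus that leads from that state to the trigger.

```python
from collections import deque

class DelayedReplica:
    """Model of the editor's hypothesis: stimulus is buffered for `depth`
    cycles before it reaches a replica of the logic. On a trigger, the
    replica state plus the buffered stimulus reconstruct the run-up
    to the bug; nothing actually runs backwards."""

    def __init__(self, step_fn, init_state, depth):
        self.step_fn = step_fn
        self.state = dict(init_state)
        self.buffer = deque()   # stimulus not yet applied to the replica
        self.depth = depth

    def clock(self, inputs):
        self.buffer.append(inputs)
        if len(self.buffer) > self.depth:
            # the replica consumes stimulus `depth` cycles late
            self.state = self.step_fn(self.state, self.buffer.popleft())

    def snapshot(self):
        # on a trigger: frozen past state + the stimulus that follows it
        return dict(self.state), list(self.buffer)

# Toy logic: a counter that accumulates its input each cycle.
def step(state, inputs):
    return {"count": state["count"] + inputs}

rep = DelayedReplica(step, {"count": 0}, depth=3)
for v in [1, 1, 1, 1, 1]:     # live design runs 5 cycles
    rep.clock(v)
past_state, pending = rep.snapshot()
print(past_state["count"], len(pending))  # 2 3
```

After five live cycles the replica has only absorbed two of them; the other three sit in the buffer, which is precisely the "dial in how far back" window described earlier.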
One of my colleagues over in the UK said that it solves the Friday night bug. You are ready to go home on a Friday night. You are already late for dinner. Your wife is calling. You need to get the hell out of there. You gently close the door and go home for the weekend. That’s exactly when the bug occurs. You come in Monday morning and you have a dead simulator in front of you. With this approach if that bug happened, when you come in Monday morning there is a live simulator window waiting with all the detailed info for you to see when that bug happened and be able to trace through the sequence that led to it. We think it is a very effective approach for getting to the root cause of something, getting at it very quickly.
The other capability we think is closely tied into this is the ability to use assertions. We think assertions are only as good as the stimulus that drives them, because assertions cover a temporal range; you have to have a time element in your verification to really see assertions run. Oftentimes this is not a good use of your simulator, because you have to make the simulator act as your bookkeeper. How many times did this event happen? What was the frequency over some range of time, or the bus value that occurred at this specific point? How does it compare to some other reference value? These sorts of functions are best done in hardware. Hardware is a better, faster, more efficient bookkeeping mechanism for seeing such things, and you do not have to slow your simulator way down by applying them. People do use assertions with simulation today, but typically in a limited way; you are not going to use a simulator for things that need a larger range of time. As an example, consider developing a cell phone. The early specification, or the system architect early on, said that there should be more than 5 reads from memory during the bootup sequence. That kind of design wisdom, design knowledge or behavioral intent would not be captured in RTL simulation, because it takes more than 7 days to exercise the bootup sequence, and you would need to create a testbench to generate the stimulus for it. Using TotalRecall, that same assertion test could be done within 10 seconds. Rather than a fictitious testbench, it is an actual in-system test using real system stimulus. We see this as a method that allows assertions to be used much more powerfully in the actual hardware design process.
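The cell-phone example above amounts to a simple bookkeeping checker: count memory reads over the bootup window and evaluate the architect's rule ("more than 5 reads during bootup") when the window closes. A hardware assertion would be written in a property language such as SVA; the sketch below just models that counting behavior in Python, with all names (`ReadCountAssertion`, `on_cycle`) being illustrative.

```python
# Hypothetical model of a hardware-style bookkeeping assertion: count
# memory reads during the bootup window, then check the architect's rule
# ("more than 5 reads from memory during bootup") when the window ends.

class ReadCountAssertion:
    def __init__(self, min_reads):
        self.min_reads = min_reads
        self.reads = 0
        self.result = None          # None until the bootup window closes

    def on_cycle(self, is_read, bootup_done):
        """Called once per clock cycle with this cycle's events."""
        if self.result is None:
            if is_read:
                self.reads += 1
            if bootup_done:
                # rule: strictly more than `min_reads` reads during bootup
                self.result = self.reads > self.min_reads

chk = ReadCountAssertion(min_reads=5)
events = [(True, False)] * 6 + [(False, True)]  # 6 reads, then bootup ends
for is_read, done in events:
    chk.on_cycle(is_read, done)
print(chk.result)  # True
```

In a simulator, maintaining this counter across a multi-day bootup run is exactly the bookkeeping burden described above; in the FPGA the same counter is just a small piece of logic running at hardware speed.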