 Real Talk
Dr. Pranav Ashar
Dr. Pranav Ashar is chief technology officer at Real Intent. He previously worked at NEC Labs developing formal verification technologies for VLSI design. With 35 patents granted and pending, he has authored about 70 papers and co-authored the book ‘Sequential Logic Synthesis’.

The Future of 3D Technologies is Fast and Heterogeneous

 
August 27th, 2015 by Dr. Pranav Ashar

With the slowdown in Moore’s law, technologists are now speculating on what future integrated circuits will look like. One constraint is the clock frequency of CMOS processors, which is topping out at around 4GHz for high-end processors in the 100W range, and at around 1-2GHz for the ~5W processors used in laptop and mobile applications. With this constraint on clock speed, IC designers are adding more cores to increase processing throughput. These additional cores bring an increasing need for easy access to high-speed memory: the performance gain will not materialize if multiple processors are contending for shared memory access.
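The contention argument can be illustrated with a deliberately simple model. The sketch below assumes an idealized shared memory bus whose bandwidth is split evenly among memory-bound cores; the function names and the numbers in the usage note (50 GB/s total, 10 GB/s demanded per core) are hypothetical and chosen only for illustration, not taken from any real processor.

```python
def per_core_bandwidth(total_bandwidth_gbs: float, cores: int) -> float:
    """Bandwidth each core gets if a shared bus is split evenly (idealized)."""
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return total_bandwidth_gbs / cores


def effective_throughput(cores: int,
                         demand_per_core_gbs: float,
                         total_bandwidth_gbs: float) -> float:
    """Throughput in 'core equivalents' when every core is memory bound.

    A core that receives only a fraction of the bandwidth it demands is
    assumed to run at that same fraction of its full speed.
    """
    available = per_core_bandwidth(total_bandwidth_gbs, cores)
    utilization = min(1.0, available / demand_per_core_gbs)
    return cores * utilization


if __name__ == "__main__":
    # Hypothetical figures: 50 GB/s shared bus, 10 GB/s demanded per core.
    for n in (1, 2, 4, 8, 16):
        print(n, effective_throughput(n, 10.0, 50.0))
```

Under these assumptions, throughput grows linearly only until the aggregate demand reaches the bus bandwidth; beyond that point, adding cores adds nothing, which is the barrier that higher memory bandwidth per core is meant to remove.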

One solution to this challenge is the combination of new 3D-manufacturing technologies with new chip architectures to overcome the bandwidth-latency barrier in high-count multi-core chips.

The following will be the key enablers for 3D manufacturing:

3D multi-layered SRAM, DRAM, Flash and “resistive” memory manufacturing technology is making rapid progress. The memory density per unit area is increasing fast enough that it will soon exceed the memory-per-core threshold (from the low hundreds of MB to 1 GB) required so that individual threads are not memory bound.

One example is the most recent announcement by Samsung regarding their 3D V-NAND technology. They have gone from a 128GBit design with a gate-stack of 24 cells in 2013, to the current production of 256GBit memory using a stack of 48 cells.  Samsung expects to achieve 1 TBit density by the year 2017 and employ a stack that is over 100 cells high.  In a previous blog, Graham Bell talked about the new XPoint 3D memory announcement from Intel/Micron that sets new levels of high-speed throughput.


Figure 1. The 128GBit Samsung 3D V-NAND memory has 24 vertical cells in the gate stack.

Besides advancements in die manufacturing, companies are making rapid progress in allowing heterogeneous semiconductor layers to be integrated vertically. The logic, interconnect, memory and analog layers will be a mix of semiconductor types and feature densities, each best suited to its purpose and at the right cost point. For example, you could have a 40nm GaAs optoelectronic interconnect layer sandwiched between 28nm CMOS logic and 14nm CMOS memory layers for inter-processor communication and to maintain cache coherence. This dis-integration of memory from the logic layer will yield higher performance than can be achieved with a monolithic approach.

The density of vertical connectors is a key metric for achieving higher-bandwidth and lower-latency communication. As this density increases, and indications are that it is doing so rapidly, the following benefits will accrue:

  • 3D-manufacturing yields will improve
  • Available bandwidth will increase
  • Energy per operation will decrease

If we reflect on the changes that will happen in the software running on these ICs, 3D technologies enable a new architecture for operating systems. The memory bandwidth of CPU threads will be very high, and the latency to access memory and other threads will both be low and predictable. In today’s vocabulary, this will be a Non-Uniform Memory Architecture. Memory will be accessed as a uniform but segmented address space, with a bandwidth and latency penalty if a thread accesses a memory location outside its segment. The use of virtual memory will be a thing of the past.
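The segmented address space described above can be sketched as a toy latency model. Everything in it is an assumption made for illustration: the class name `SegmentedMemory`, the fixed segment size, and the two latency values are hypothetical and do not describe any real chip or operating system.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SegmentedMemory:
    """Toy model: one flat address space cut into equal-size segments."""
    segment_size: int          # bytes per segment (hypothetical)
    local_latency_ns: float    # access inside the thread's own segment
    remote_latency_ns: float   # access to any other segment

    def segment_of(self, address: int) -> int:
        """Index of the segment that contains this address."""
        return address // self.segment_size

    def access_latency_ns(self, thread_segment: int, address: int) -> float:
        """Latency seen by a thread whose home segment is thread_segment."""
        if self.segment_of(address) == thread_segment:
            return self.local_latency_ns
        return self.remote_latency_ns


def average_latency_ns(mem: SegmentedMemory,
                       thread_segment: int,
                       addresses: Iterable[int]) -> float:
    """Mean latency over a sequence of accesses by one thread."""
    addresses = list(addresses)
    if not addresses:
        raise ValueError("need at least one address")
    total = sum(mem.access_latency_ns(thread_segment, a) for a in addresses)
    return total / len(addresses)


if __name__ == "__main__":
    # Hypothetical numbers: 1 KiB segments, 10 ns local, 40 ns remote.
    mem = SegmentedMemory(1024, 10.0, 40.0)
    print(average_latency_ns(mem, 0, [0, 1, 1024, 2048]))
```

In this model every address is reachable from every thread, so the address space is uniform, but the cost depends on which segment the address falls in. A workload that keeps its accesses inside its own segment sees only the local latency, while one that scatters accesses across segments pays the remote penalty in proportion.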

Our use of hard disks in our computers will go the way of magnetic tape, since RAM density will go through the roof.

Fundamental computations like Boolean Satisfiability and logic simulation, which are used in our EDA software tools and have not yet fully benefited from the potential of multi-core parallelism, will start to see dramatic speedups. Graph traversal algorithms that fundamentally lack memory locality will also benefit measurably.

I can’t wait for these developments to occur. And there is no mystery in achieving these breakthroughs. They are coming, and the only question that remains is when.



