By Louie De Luna, Agnisys Chief Product Evangelist
The conventional von Neumann architecture has been the workhorse of computing for several decades, but with the advent of AI applications and big data, the entire industry has put a spotlight on its limitations. Because massive amounts of data must travel back and forth between the CPU and memory, the resulting latency and power consumption have become major issues. AlexNet, one of the landmark convolutional neural networks (CNNs), requires roughly 61M weights (parameters) and 724M multiply-accumulate operations (MACs) for a single inference pass, and that is a modest requirement compared with other CNNs such as VGGNet, which requires 138M weights and 15.5G MACs.
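To see where totals like these come from, consider how the cost of a single convolutional layer is counted: every output activation of every filter needs one multiply-accumulate per weight in the filter's receptive field. A minimal sketch in Python, using the commonly published dimensions of AlexNet's first convolutional layer (the helper function and its parameter names are illustrative, not from any particular framework):

```python
def conv_layer_cost(c_in, k, c_out, h_out, w_out):
    """Weight count and MAC count for one convolutional layer.

    weights = c_in * k * k * c_out      (ignoring biases)
    MACs    = weights * h_out * w_out   (each filter is applied at every output position)
    """
    weights = c_in * k * k * c_out
    macs = weights * h_out * w_out
    return weights, macs

# AlexNet conv1: 96 filters of 11x11x3, stride 4 over a 227x227x3 image,
# producing a 55x55x96 output feature map.
w, m = conv_layer_cost(c_in=3, k=11, c_out=96, h_out=55, w_out=55)
print(f"conv1: {w:,} weights, {m:,} MACs")  # 34,848 weights, ~105M MACs
```

Summing this calculation over all layers of a network is what yields the hundreds of millions of MACs per inference, and every one of those operands has to be moved between memory and compute.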
New chip architectures and technologies are now emerging to address these issues, commonly referred to as the “von Neumann bottleneck” or the “memory wall” problem. The Google TPU is built around systolic arrays and provides up to 420 teraflops, the Graphcore IPU is based on Bulk Synchronous Parallel (BSP) technology and provides up to 125 teraflops, and IBM’s Zurich lab is working on a new AI chip based on in-memory computing.
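The systolic-array idea behind the TPU is straightforward to illustrate in software: operands stream through a grid of multiply-accumulate cells, so each value is reused by neighboring cells as it passes through rather than being refetched from memory. The toy, cycle-level model below is only an illustration of the dataflow, not the TPU's actual implementation:

```python
import numpy as np

def systolic_matmul(A, B):
    """Cycle-level toy model of an output-stationary systolic array.

    The processing element (PE) at grid position (i, j) accumulates C[i, j].
    Rows of A stream in from the left and columns of B from the top, each
    skewed by one cycle per row/column so matching operands meet at the PE.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    cycles = n + m + k - 2  # skew in + skew out + k reduction steps
    for t in range(cycles):
        for i in range(n):
            for j in range(m):
                step = t - i - j  # which dot-product term reaches PE (i, j) this cycle
                if 0 <= step < k:
                    C[i, j] += A[i, step] * B[step, j]
    return C, cycles

A = np.random.rand(4, 6)
B = np.random.rand(6, 3)
C, cycles = systolic_matmul(A, B)
assert np.allclose(C, A @ B)
print(f"result matches numpy matmul after {cycles} cycles")
```

Because each operand is passed from cell to cell instead of read from DRAM for every MAC, the memory traffic per operation drops sharply, which is precisely the bottleneck these architectures target.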
But as the world of computing and AI waits for these new chip architectures to mature, the memory wall problem remains a real pain. Startups without the backing of deep pockets will need to find other ingenious ways to stay competitive.