This year’s fall edition of the Linley Processor Conference – held on November 1 and 2 in Santa Clara, California – was, as usual, a good vantage point for keeping abreast of trends and products in neural network acceleration. In this article we provide a brief overview of one part of the conference, focusing on the keynote given by Linley Gwennap and on the presentations from companies that addressed AI acceleration. The event, of course, offered many more presentations concerning ‘conventional’ (non-AI) processors and other processing-related themes, which we will not cover here.
Linley Gwennap’s keynote: trends in AI acceleration
In his keynote, Linley Gwennap – principal analyst at TechInsights – noted that the growth of AI model size has slowed, as training has become increasingly resource-intensive: for example, training the GPT-3 language model occupies 1,024 Nvidia A100 GPUs for over one month. The rapid growth of AI model size was enabled by moving training to large processing clusters, but cluster size is topping out for cost reasons: 1,024 GPUs cost approximately $25 million. As a result, the size of the largest trained models has remained essentially flat over the past year, and recent progress has focused on models that use less compute per parameter. Future growth of AI model size will be paced by hardware progress, e.g. the availability of new Nvidia H100 clusters.
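To put these figures in perspective, the back-of-envelope sketch below turns the keynote numbers into GPU-hours and an implied per-GPU price. It is only an illustration of the quoted figures: it assumes “one month” means 30 days and that the $25 million cluster price is spread evenly across the GPUs, ignoring networking, power, and hosting costs.

```python
# Back-of-envelope calculation based on the keynote figures quoted above.
# Assumptions (not from the talk): "one month" taken as 30 days; cluster
# price divided evenly across GPUs, with no networking/power/hosting costs.

num_gpus = 1024              # Nvidia A100 GPUs in the training cluster
training_days = 30           # "over one month", approximated as 30 days
cluster_cost_usd = 25_000_000

gpu_hours = num_gpus * training_days * 24
cost_per_gpu = cluster_cost_usd / num_gpus

print(f"GPU-hours for one GPT-3 training run: ~{gpu_hours:,}")          # ~737,280
print(f"Implied price per A100 (cluster cost / GPU count): ~${cost_per_gpu:,.0f}")  # ~$24,400
```

Roughly 700,000 GPU-hours per training run, at a hardware cost in the tens of millions of dollars, is what makes further scaling of model size dependent on cheaper or faster hardware such as the upcoming H100 clusters.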