EDACafe Weekly Review May 25th, 2018

Intel’s Gadi Singer believes his most important challenge is his latest: using artificial intelligence (AI) to reshape scientific exploration.

Gadi Singer, vice president and architecture general manager for the Artificial Intelligence Products Group at Intel, uses artificial intelligence to reshape scientific exploration. Before his role with AI, the 35-year Intel veteran helped create the first Pentium processor; led development of the first Xeon processors and the first Atom processor; and oversaw architecture for generations of the Intel Core processors. (Photo Credit: Walden Kirsch/Intel Corporation)

In a Q&A timed with the first Intel AI DevCon event, the Intel vice president and architecture general manager for its Artificial Intelligence Products Group discussed his role at the intersection of science — computing’s most demanding customer — and AI, how scientists should approach AI and why it is the most dynamic and exciting opportunity he has faced.

Q. How is AI changing science?

Scientific exploration is going through a transition that, in the last 100 years, might only be compared to what happened in the ‘50s and ‘60s, moving to data and large data systems. In the ‘60s, the amount of data being gathered was so large that the frontrunners were not those with the finest instruments, but rather those able to analyze the data that was gathered in any scientific area, whether it was climate, seismology, biology, pharmaceuticals, the exploration of new medicine, and so on.

Today, the data has grown to levels far exceeding the ability of people to ask particular queries or look for particular insights. The combination of this data deluge with modern computing and deep learning techniques is providing new and often disruptive capabilities.

Q. What’s an example?

One of them, which uses the basic strength of deep learning, is the identification of very faint patterns within a very noisy dataset, and even in the absence of an exact mathematical model of what you’re looking for.

Think about cosmic events happening in a faraway galaxy, where you’re looking for some characteristics of the phenomena to spot them in a very large dataset. This is an instance of searching without a known equation: you are able to give examples and, through them, let the deep learning system learn what to look for and ultimately find the particular pattern.

Q. So you know what you’re looking for but you don’t know how to find it?

You can’t define the exact mathematical equation or the queries that describe it. The data is too large for trial and error, and previous big-data analytics techniques do not have enough defined features to successfully search for the pattern.

You know what you’re looking for because you tagged several examples of it in your data, and you can generally describe it. Deep learning can help you spot occurrences from such a class within a noisy multidimensional dataset.
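As a minimal sketch of this idea, the toy example below (all data synthetic, with simple logistic regression standing in for a deep network) learns to spot a faint tagged pattern buried in heavy noise, given nothing but labeled examples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: a faint "event" template buried in strong noise.
n, d = 400, 64
template = 0.5 * np.sin(np.linspace(0, 3 * np.pi, d))  # the faint pattern
labels = rng.integers(0, 2, n)                          # tagged examples
signals = rng.normal(0.0, 1.0, (n, d)) + labels[:, None] * template

# A learned detector trained by gradient descent: it discovers what to
# look for from the tagged examples alone, with no equation for the event.
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(signals @ w + b)))  # predicted probabilities
    w -= 0.5 * (signals.T @ (p - labels) / n)      # gradient step on weights
    b -= 0.5 * float(np.mean(p - labels))          # gradient step on bias

pred = (1.0 / (1.0 + np.exp(-(signals @ w + b))) > 0.5).astype(int)
accuracy = float(np.mean(pred == labels))
```

The learned weight vector `w` ends up resembling the hidden template, even though no single example shows the pattern clearly above the noise.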

Q. Are there other ways AI can change the scientific approach?

Another example is when you do have a mathematical model, like a set of accurate equations. In this case you can use AI to achieve comparable results in 10,000 times less time and computing.

Say you have a new molecular structure and you want to know how it’s going to behave in some environment for pharma exploration. There are very good predictive models on how it will behave. The problem is that those models take a tremendous amount of computation and time — it might take you weeks to try just one combination.

More: Intel AI VP Gadi Singer on One Song to the Tune of Another (The Next Platform) | Intel AI DevCon (Press Kit) | Artificial Intelligence at Intel (Press Kit) | More Intel Explainers

In such a case, you can use a deep learning system to shadow the accurate system of equations. You iteratively feed sample cases to this system of equations, and you get the results days later. The deep learning network learns the relationship between the input and the output without knowing the equation itself; it just tracks it. It has been demonstrated in multiple cases that, after you train the deep learning system with enough examples, it shows excellent ability to predict the result that would be given by the exact model. This translates to an efficiency that could turn hours or days into seconds.

Granted, sometimes the full computation will be required for ultimate model accuracy. However, that would only be needed for a small subset of cases. The fact that you can generate an accurate result so much faster with a fraction of the power and the time allows you to explore the potential solution space much faster.
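The shadowing workflow described above can be sketched end to end. Everything in this toy version is invented for illustration: a cheap analytic function stands in for the expensive system of equations, and a random-feature network (a fixed random hidden layer with a least-squares readout) stands in for a trained deep network:

```python
import numpy as np

rng = np.random.default_rng(1)

def exact_model(x):
    # Stand-in for an expensive, exact system of equations;
    # a real one might take hours or days per batch of cases.
    return np.sin(3 * x) + 0.5 * x ** 2

# Step 1: run the exact model on sample inputs (the slow part, done once).
x_train = rng.uniform(-2, 2, (400, 1))
y_train = exact_model(x_train)

# Step 2: fit a surrogate that learns the input->output relationship
# without ever seeing the underlying equations.
hidden = 64
W1 = rng.normal(0.0, 1.0, (1, hidden))   # fixed random hidden weights
b1 = rng.uniform(-2, 2, hidden)
features = np.tanh(x_train @ W1 + b1)
W2, *_ = np.linalg.lstsq(features, y_train, rcond=None)  # readout weights

# Step 3: the surrogate now answers new queries near-instantly.
x_new = np.linspace(-2, 2, 200).reshape(-1, 1)
y_surrogate = np.tanh(x_new @ W1 + b1) @ W2
mse = float(np.mean((y_surrogate - exact_model(x_new)) ** 2))
```

Once fit, each surrogate query is a couple of matrix multiplies, which is the source of the orders-of-magnitude savings in time and computation the text describes; the exact model is then reserved for the small subset of cases where ultimate accuracy matters.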

In the last couple years, new machine learning methods have emerged for “learning how to learn.” These technologies are tackling an almost-endless realm of options — like all the possible mutations in human DNA — and are using exploration and meta-learning techniques to identify the most relevant options to evaluate.

“Scientists need to partner with AI…to explore and investigate which new possibilities have the best likelihood of breakthroughs and new solutions.”
— Gadi Singer, vice president and architecture general manager of Intel’s Artificial Intelligence Products Group

Q. What’s the big-picture impact to the scientific method or just the approach that a scientist would take with AI?

Scientists need to partner with AI. They can greatly benefit from mastering the tools of AI, such as deep learning and others, in order to explore phenomena that are less defined, or when they need faster performance by orders of magnitude to address a large space. Scientists can partner with machine learning to explore and investigate which new possibilities have the best likelihood of breakthroughs and new solutions.

Q. I’m guessing you could retire if you wanted to. What keeps you going now?

Well, I’m having a great time. AI at Intel today is about solving the most exciting and most challenging problems the industry and science are facing. This is an area that moves faster than anything I’ve seen in my 35 years at Intel, by far.

The other aspect is that I’m looking at it as a change that is brewing in the interaction between humans and machines. I want to be part of the effort of creating this new link. When I talk about partnership of science and AI, or autonomous vehicles and other areas, there’s a role here for a broader thinking than just how to give the fastest processor for the task. This newly forged interaction between people and AI is another fascinating part of this space.

What’s New: Intel collaborates with Novartis* on the use of deep neural networks (DNN) to accelerate high content screening – a key element of early drug discovery. The collaboration team cut time to train image analysis models from 11 hours to 31 minutes – an improvement of greater than 20 times[1].

Collaboration team members from Novartis and Intel used eight CPU-based servers, a high-speed fabric interconnect and optimized TensorFlow to achieve the improvement in time needed to process a dataset of 10K images.

Why It’s Important: High content screening of cellular phenotypes is a fundamental tool supporting early drug discovery. The term “high content” signifies the rich set of thousands of pre-defined features (such as size, shape, texture) that are extracted from images using classical image-processing techniques. High content screening allows analysis of microscopic images to study the effects of thousands of genetic or chemical treatments on different cell cultures.
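For illustration, here is what a few of those pre-defined features might look like in code, computed on a synthetic one-cell image. The thresholding and feature choices are deliberate simplifications; real high content pipelines (e.g., tools like CellProfiler) extract hundreds of features per cell:

```python
import numpy as np

rng = np.random.default_rng(3)

# A synthetic 64x64 "microscopy" image: one disk-shaped cell plus noise.
yy, xx = np.mgrid[0:64, 0:64]
cell = (yy - 32) ** 2 + (xx - 32) ** 2 <= 15 ** 2
image = cell.astype(float) + rng.normal(0.0, 0.05, (64, 64))

# Classical pre-defined features, in the spirit of high content screening.
mask = image > 0.5                        # crude segmentation by threshold
size = int(mask.sum())                    # size: area in pixels
rows = np.where(mask.any(axis=1))[0]
cols = np.where(mask.any(axis=0))[0]
aspect = (rows[-1] - rows[0] + 1) / (cols[-1] - cols[0] + 1)  # shape proxy
texture = float(image[mask].std())        # texture: intensity variation
```

Each feature is hand-defined in advance, which is exactly the limitation the deep learning approach described below removes.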

The promise of deep learning is that relevant image features that can distinguish one treatment from another are “automatically” learned from the data. By applying deep neural network acceleration, biologists and data scientists at Intel and Novartis hope to speed up the analysis of high content imaging screens. In this joint work, the team is focusing on whole microscopy images as opposed to using a separate process to identify each cell in an image first. Whole microscopy images can be much larger than those typically found in deep learning datasets. For example, the images used in this evaluation are more than 26 times larger than images typically used from the well-known ImageNet* dataset of animals, objects and scenes.

Deep convolutional neural network models for analyzing microscopy images typically work with millions of pixels per image, millions of parameters in the model and possibly thousands of training images at a time. That constitutes a high computational load. Even with advanced computational capabilities on existing computing infrastructure, deeper exploration of DNN models can be prohibitive in terms of time.
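A back-of-envelope estimate makes that load concrete. The layer sizes here are assumptions chosen only for illustration: per-layer compute for a 3x3 convolution scales with pixel count, so a hypothetical 2048x1900 (roughly 3.9-megapixel) microscopy image costs far more than a 224x224 ImageNet-style input:

```python
def conv_flops(h, w, c_in, c_out, k=3):
    # Approximate multiply-add count for one k x k convolution layer
    # with stride 1 and "same" padding: 2 * H * W * C_in * C_out * k * k.
    return 2 * h * w * c_in * c_out * k * k

small = conv_flops(224, 224, 3, 64)    # ImageNet-scale input, 64 filters
large = conv_flops(2048, 1900, 3, 64)  # ~3.9-megapixel microscopy image
ratio = large / small                  # cost multiple for the first layer alone
```

Multiply that single-layer gap across the dozens of layers in a modern network, and the motivation for acceleration is clear.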

To solve these challenges, the collaboration is applying deep neural network acceleration techniques to process multiple images in significantly less time while extracting greater insight from image features that the model ultimately learns.

What It Looks Like: The collaboration team, with representatives from Novartis and Intel, has shown more than 20 times[1] improvement in the time to process a dataset of 10K images for training. Using the Broad Bioimage Benchmark Collection* 021 (BBBC-021) dataset, the team achieved a total processing time of 31 minutes with over 99 percent accuracy.

For this result, the team used eight CPU-based servers, a high-speed fabric interconnect, and optimized TensorFlow[1]. By exploiting the fundamental principle of data parallelism in deep learning training and fully utilizing the large-memory support of the server platform, the team was able to scale to more than 120 3.9-megapixel images per second with 32 TensorFlow* workers.
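The data-parallelism principle mentioned above can be sketched in a few lines: each worker computes a gradient on its own shard of the batch, the gradients are averaged (the role an allreduce framework like Horovod plays across nodes), and every worker applies the identical update. The model and sizes here are toys:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy problem: recover the weights of a linear model from sharded data.
n, d, n_workers = 128, 16, 8
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

def local_gradient(X_shard, y_shard, w):
    # Each worker computes the squared-error gradient on its own shard.
    return 2 * X_shard.T @ (X_shard @ w - y_shard) / len(X_shard)

w = np.zeros(d)
for _ in range(300):
    shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
    grads = [local_gradient(Xs, ys, w) for Xs, ys in shards]
    # "Allreduce" step: average the workers' gradients, then apply the
    # identical update everywhere, keeping all replicas in sync.
    w -= 0.05 * np.mean(grads, axis=0)

err = float(np.linalg.norm(w - w_true))
```

Because the shards are equal-sized, the averaged gradient equals the full-batch gradient, so all workers stay in lockstep; that equivalence is what lets training scale across nodes without changing the result.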

What’s Next: While supervised deep learning methods are essential to accelerating image classification and speeding time to insight, deep learning methods depend on large expert-labeled datasets to train the models. The time and manual effort necessary to create such datasets is often prohibitive. Unsupervised deep learning methods – that may be applied to unlabeled microscopy images – hold the promise of revealing novel insights for cellular biology and ultimately drug discovery. This will be the focus of continuing efforts in the future.

More Context: Artificial Intelligence at Intel | Advancing Data-Driven Healthcare Solutions | 2018 Intel AI DevCon (Press Kit)

The Fine Print:

[1] 20x claim based on 21.7x speed up achieved by scaling from single node system to 8-socket cluster.

8-socket cluster node configuration:

CPU: Intel® Xeon® 6148 Processor @ 2.4GHz
Cores: 40
Sockets: 2
Hyper-threading: Enabled
Memory/node: 192GB, 2666MHz
NIC: Intel® Omni-Path Host Fabric Interface (Intel® OP HFI)
TensorFlow: v1.7.0
Horovod: 0.12.1
OpenMPI: 3.0.0
Cluster: ToR Switch: Intel® Omni-Path Switch

Single node configuration:

CPU: Intel® Xeon® Phi Processor 7290F
192GB DDR4 RAM
1x 1.6TB Intel® SSD DC S3610 Series SC2BX016T4
1x 480GB Intel® SSD DC S3520 Series SC2BB480G7
Intel® MKL 2017/DAAL/Intel Caffe

*References:

BBBC-021: Ljosa V, Sokolnicki KL, Carpenter AE, Annotated high-throughput microscopy image sets for validation, Nature Methods, 2012

ImageNet: Russakovsky O et al, ImageNet Large Scale Visual Recognition Challenge, IJCV, 2015

Tensorflow: Abadi M et al, Large-Scale Machine Learning on Heterogeneous Systems, Software available from tensorflow.org

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase.  For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Intel, the Intel logo, and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.

IC Insights raises its full-year spending growth forecast for this year from 8% to 14%.
IC Insights recently released its May Update to the 2018 McClean Report.  This Update included a look at the top-25 1Q18 semiconductor suppliers, a discussion of the 1Q18 IC industry market results, and an update of the 2018 capital spending forecast by company.

Overall, the capital spending story for 2018 is becoming much more positive as compared with the forecast presented in IC Insights’ March Update to The McClean Report 2018 (MR18).  In the March Update, IC Insights forecast an 8% increase in semiconductor industry capital spending for this year. However, as shown in Figure 1, IC Insights has raised its expectations for 2018 capital spending by six percentage points to a 14% increase.  If this increase occurs, it would be the first time that semiconductor industry capital outlays exceeded $100 billion.  The worldwide 2018 capital spending forecast figure is 53% higher than the spending just two years earlier in 2016.
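For a quick consistency check of those percentages (taking 2018 spending to sit right at the $100 billion mark, which is an assumption; the exact forecast figure is not given here):

```python
# All figures in billions of dollars; 2018 level is an assumed round number.
spend_2018 = 100.0
spend_2017 = spend_2018 / 1.14   # implied by the 14% growth forecast
spend_2016 = spend_2018 / 1.53   # implied by "53% higher than 2016"
growth_2017 = spend_2017 / spend_2016 - 1   # implied 2017-over-2016 growth
```

Under that assumption, 2017 spending comes out near $88 billion and 2016 near $65 billion, implying 2017 itself grew roughly a third over 2016.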

Although Samsung says it still does not have a full-year capital spending forecast for this year, it did say it will spend “less” in semiconductor capital outlays in 2018 as compared to 2017, when it spent $24.2 billion.  However, as of 1Q18, with regard to its capex, its “foot is still on the gas!”  Samsung spent $6.72 billion in capex for its semiconductor division in 1Q18, slightly higher than the average of the previous three quarters.  This figure is almost 4x the amount the company spent just two years earlier in 1Q16!  Over the past four quarters, Samsung has spent an incredible $26.6 billion in capital outlays for its semiconductor group. Wow!

IC Insights has estimated Samsung’s semiconductor group capital spending will be $20.0 billion this year, $4.2 billion less than it spent in 2017.  However, given the strong start to its spending this year, it appears there is currently more upside than downside potential to this forecast.

With the DRAM and NAND flash memory markets still very strong, SK Hynix is expected to ramp up its capital spending this year to $11.5 billion, 42% greater than the $8.1 billion it spent in 2017. The increased spending by SK Hynix this year will primarily focus on bringing online two large memory fabs—M15, a 3D NAND flash fab in Cheongju, South Korea, and the expansion of its huge DRAM fab in Wuxi, China.  The Cheongju fab is being pushed to open before the end of this year.  The Wuxi fab is also targeted to open by the end of this year, a few months earlier than its originally planned start date of early 2019.


Figure 1

Report Details: The 2018 McClean Report
Additional details on current IC market trends are provided in the May Update to The McClean Report—A Complete Analysis and Forecast of the Integrated Circuit Industry. A subscription to The McClean Report includes free monthly updates from March through November (including a 250+ page Mid-Year Update) and free access to subscriber-only webinars throughout the year. An individual-user license to the 2018 edition of The McClean Report is priced at $4,290 and includes an Internet access password.  A multi-user worldwide corporate license is available for $7,290.

To review additional information about IC Insights’ new and existing market research reports and services please visit our website: www.icinsights.com.
More Information Contact
For more information regarding this Research Bulletin, please contact Bill McClean, President at IC Insights. Phone: +1-480-348-1133, email: bill@icinsights.com
PDF Version of This Bulletin
A PDF version of this Research Bulletin can be downloaded from our website at http://www.icinsights.com/news/bulletins/



Copyright © 2018, Internet Business Systems, Inc. — 25 North 14th Street, Suite 710 San Jose, CA 95112 — +1 (408) 882-6554 — All rights reserved.