Unlocking the Potential of Deep Learning with Virtualized AI

  • A virtualized AI infrastructure for DL would run a single workload on multiple shared physical resources.

  • A widely discussed article by Google’s Francois Chollet critiques the skill-acquisition-based approach to measuring intelligence.

  • The conventional computing stack – from processor to firmware to virtualization, abstraction, orchestration and operating layers through to end-user software – was designed for traditional workloads, not long-running DL training.


Is the recent progress in deep learning true artificial intelligence? A widely discussed article by Google’s Francois Chollet critiques the skill-acquisition-based approach to measuring intelligence – the one currently in use in modern DL. He argues that with huge data sets available for training models, AI is mastering skill acquisition but not necessarily the “scope, generalization difficulty, priors, and experience” that true AI should incorporate. Even with our progress in AI, and specifically in DL, we are nowhere near the limits of what DL can achieve with bigger, better-trained, more accurate models – ones that take into account not only skill but also experience, and the generalization of that experience.

Understandably, this has put intense focus on computing power, particularly the hardware that enables data scientists to run complex training experiments.


Nvidia increasingly sees DL as a key market for its GPUs and bought Mellanox to speed communication inside GPU clusters. With its recent acquisition of Habana Labs, Intel is likely betting that custom AI accelerator hardware is a better match for these workloads.


Other AI-first hardware includes Cerebras’s massive wafer-scale chip, packaged in a custom system designed for the kind of intensive, long-running workloads that training DL models requires. In the cloud, Google’s Tensor Processing Units offer another bespoke option.

For companies running their own DL workloads, more compute is generally better. Whether on exotic AI accelerators or tried-and-tested GPUs, quicker model training means more iterations, faster innovation and reduced time to market. It may even mean we achieve “strong” AI (i.e., AI that goes beyond “narrow AI,” which can handle only a single, discrete task) sooner.


In 2020, continuing the trend of recent years, companies will invest in ever more AI hardware, in an effort to satisfy data scientists’ demands for the compute to run bigger models that solve more complex business problems. But hardware isn’t the whole picture.


The conventional computing stack – from processor to firmware to virtualization, abstraction, orchestration and operating layers through to end-user software – was designed for traditional workloads, prioritizing high-availability, short-duration operations. Training a DL model, though, is the opposite of this sort of workload: a single experiment may need 100 percent of the computing power of one or more processors for hours or even days at a time.
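To make the contrast concrete, here is a minimal sketch of what such a workload looks like in code – a data-parallel training loop written with PyTorch’s DistributedDataParallel, chosen purely for illustration (the model, data and step count below are stand-in placeholders, not anything from this article):

```python
# Minimal sketch of a long-running, GPU-saturating DL training job.
# The model, batch data and step count are hypothetical placeholders.
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def train():
    # One process per GPU; torchrun sets RANK, LOCAL_RANK and WORLD_SIZE.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # A stand-in model, replicated onto every GPU in the job.
    model = DDP(nn.Linear(1024, 1024).cuda(local_rank),
                device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    # Many thousands of steps over a large dataset: each step drives the
    # GPU to full utilization, and the job as a whole runs for hours or days.
    for step in range(100_000):
        inputs = torch.randn(512, 1024, device=local_rank)   # stand-in batch
        targets = torch.randn(512, 1024, device=local_rank)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()   # gradients are all-reduced across every GPU
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    train()  # launch with: torchrun --nproc_per_node=<num_gpus> train.py
```

Launched across a node’s GPUs, a loop like this monopolizes every device at full utilization for the entire run – exactly the long, resource-hungry workload the conventional stack was never designed to schedule.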
