Deep Learning Essentials

Impact of deep learning

To show some of the impact of deep learning, let’s take a look at two specific areas: image recognition and speech recognition.

The following figure, Performance on ImageNet classification over time, shows the trend in top-5 error rate for ILSVRC contest winners over the past several years. Traditional image recognition approaches employ hand-crafted computer vision features combined with classifiers trained on a number of instances of each object class, for example, SIFT + Fisher vector. In 2012, deep learning entered this competition. Alex Krizhevsky and Professor Hinton from the University of Toronto stunned the field when their deep convolutional neural network (AlexNet) cut the error rate by around 10 percentage points. Since then, the leaderboard has been occupied by this type of method and its variations. By 2015, the error rate had dropped below that of human testers:

Performance on ImageNet classification over time
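
To make the top-5 error rate in the figure concrete, here is a minimal sketch of how such a metric can be computed. A prediction counts as correct if the true label is among the five classes with the highest scores. The function name top_k_error and the toy scores are illustrative assumptions and are not taken from the ILSVRC evaluation code.

```python
def top_k_error(scores, labels, k=5):
    """Fraction of examples whose true label is NOT among the k
    highest-scoring classes.

    scores: list of per-class score lists, one list per example
    labels: list of true class indices, one per example
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    misses = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Toy data (hypothetical): two examples, six classes.
toy_scores = [
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],  # class 0 has the lowest score
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],  # class 1 has the second-highest score
]
toy_labels = [0, 1]

top5 = top_k_error(toy_scores, toy_labels, k=5)
top1 = top_k_error(toy_scores, toy_labels, k=1)
```

In this toy data the first example is a miss under both metrics, while the second is a top-5 hit but a top-1 miss, which shows why top-5 error is always less than or equal to top-1 error.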

The following figure, Speech recognition progress, depicts recent progress in the area of speech recognition. From 2000 to 2009, there was very little progress. Since 2009, the involvement of deep learning, large datasets, and fast computing has significantly boosted development. In 2016, a major breakthrough was made by a team of researchers and engineers at Microsoft Research AI (MSR AI). They reported a speech recognition system that made the same number of errors as professional transcriptionists, or fewer, with a word error rate (WER) of 5.9%. In other words, the technology could recognize words in a conversation as well as a person does:

Speech recognition progress
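
The word error rate (WER) mentioned above is commonly defined as the word-level edit distance between the recognized text and a reference transcript, divided by the number of words in the reference. The following is a minimal sketch under that definition. The function name word_error_rate and the example sentences are illustrative assumptions, and real evaluations typically also normalize text (case, punctuation) before scoring.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions + deletions + insertions)
    divided by the number of words in the reference transcript."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with nothing: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one
# deletion ("the") against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Here the example yields two errors over six reference words, so the WER is 2/6. Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.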

A natural question to ask is: what are the advantages of deep learning over traditional approaches? Topology defines functionality. But why do we need an expensive deep architecture? Is it really necessary? What are we trying to achieve here? It turns out that there is both theoretical and empirical evidence in favor of multiple levels of representation. In the next section, let’s dive into more details about the deep architecture of deep learning.