What is a Neural Network?
Deep learning (DL) is a new research direction in the field of machine learning (ML). It was introduced into machine learning to bring it closer to its original goal: artificial intelligence (AI). [1]
- Chinese name: Deep learning
- Foreign name: Deep Learning
- Presenters: Geoffrey Hinton, Yoshua Bengio, Yann LeCun, etc.
- Presentation time: 2006
- Subject: artificial intelligence
- Applications: computer vision, natural language processing, bioinformatics, etc.
- Deep learning learns the inherent laws and levels of representation of sample data, and the information obtained in this learning process is of great help in interpreting data such as text, images, and sound. Its ultimate goal is to enable machines to analyze and learn like humans and to recognize data such as text, images, and sound. Deep learning is a complex machine learning approach whose results in speech and image recognition far exceed those of earlier related technologies. [1]
- Deep learning has achieved many results in search technology, data mining, machine learning, machine translation, natural language processing, multimedia learning, speech, recommendation and personalization technologies, and other related fields. It enables machines to imitate human activities such as seeing, hearing, and thinking, and solves many complex pattern recognition problems, driving great progress in artificial intelligence-related technologies. [1]
Introduction to Deep Learning
- Deep learning is a general term for a class of pattern analysis methods. In terms of specific research content, it mainly involves three types of methods: [2]
- (1) Neural network systems based on convolution operations, i.e., convolutional neural networks (CNNs). [2]
- (2) Self-encoding neural networks based on multi-layer neurons, including autoencoders (Auto encoder) and sparse coding (Sparse Coding), both of which have received widespread attention in recent years. [2]
- (3) Deep belief networks (DBNs), which pre-train a multi-layer self-encoding neural network and then use discriminative information to further optimize the network weights. [2]
- Through multi-layer processing, an initial "low-level" feature representation is gradually transformed into a "high-level" feature representation, after which "simple models" can complete complex learning tasks such as classification. Deep learning can therefore be understood as "feature learning" or "representation learning". [3]
- In the past, when machine learning was applied to real-world tasks, the features describing the samples were usually designed by human experts, a practice known as "feature engineering". As is well known, feature quality has a crucial impact on generalization performance, and it is not easy for human experts to design good features. Feature learning (representation learning) instead generates good features through machine learning itself, taking machine learning one step closer to "fully automated data analysis". [3]
- In recent years, researchers have gradually combined these types of methods, for example, combining convolutional neural networks trained by supervised learning with the unsupervised pre-training of self-encoding neural networks, and then using discriminative information to fine-tune the network parameters, forming the convolutional deep belief network. Compared with traditional learning methods, deep learning methods have far more model parameters, so model training is more difficult; by the general law of statistical learning, the more parameters a model has, the more data is needed to train it. [2]
- In the 1980s and 1990s, because computing power was limited and the amount of data available for analysis was too small, deep learning did not show outstanding recognition performance in pattern analysis. Since 2006, when Hinton et al. proposed the CD-k (contrastive divergence) algorithm to quickly compute the weights and biases of restricted Boltzmann machine (RBM) networks, the RBM has become a powerful tool for increasing the depth of neural networks, leading to the widespread use of DBNs (developed by Hinton et al. and applied by companies such as Microsoft to speech recognition) and other deep networks. At the same time, sparse coding is also used in deep learning because it can automatically extract features from data. Convolutional neural network methods based on local data regions have likewise been studied extensively in recent years. [2]
Definition of deep learning
- Deep learning is a type of machine learning, and machine learning is a necessary path to achieving artificial intelligence. The concept of deep learning originates from the study of artificial neural networks; a multilayer perceptron with multiple hidden layers is one kind of deep learning structure. Deep learning combines low-level features to form more abstract high-level representations (attribute categories or features), in order to discover distributed feature representations of the data. The motivation for researching deep learning is to build neural networks that simulate the human brain's analysis and learning, mimicking the mechanisms by which the brain interprets data such as images, sounds, and text. [4]
(Figure: a deep learning model with multiple hidden layers)
- The computation that produces an output from an input can be represented by a flow graph: a graph in which each node represents a basic computation applied to the values of its parent nodes, with the result passed on as the value used by the node's children. The set of computations allowed at each node, together with the possible graph structures, defines a family of functions. Input nodes have no parents and output nodes have no children. [4]
- One particular property of this flow graph is depth: the length of the longest path from one input to one output. [4]
- A traditional feed-forward neural network can be viewed as having a depth equal to its number of layers (for example, the number of hidden layers plus 1 for the output layer), while an SVM has a depth of 2 (one level corresponding to the kernel outputs or feature space, and one corresponding to the linear combination of those outputs that produces the result); a short sketch after this list makes this notion of depth concrete. [4]
- One research direction in artificial intelligence is represented by so-called "expert systems", which are defined by large numbers of "If-Then" rules, a top-down approach. The artificial neural network (Artificial Neural Network) marks a different, bottom-up approach. Neural networks have no strict formal definition; their basic characteristic is that they try to imitate the way neurons in the brain transmit and process information. [4]
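To make the notion of depth concrete, here is a minimal sketch, a hypothetical toy not taken from the source, that represents a computation as a flow graph in Python and measures the longest input-to-output path; the depth-2 result matches the SVM example above.

```python
# Each node maps to the list of its parent nodes; input nodes have no parents.
# (A made-up two-level graph, analogous to kernel outputs + linear combination.)
graph = {
    "x1": [], "x2": [],      # input nodes (no parents)
    "h":  ["x1", "x2"],      # basic computation applied to the inputs
    "y":  ["h"],             # output node (no children)
}

def depth(node):
    """Length of the longest path from any input node to `node`."""
    parents = graph[node]
    if not parents:          # an input node contributes depth 0
        return 0
    return 1 + max(depth(p) for p in parents)

print(depth("y"))  # -> 2, i.e. a depth-2 flow graph like the SVM example
```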
Characteristics of deep learning
- Deep learning differs from traditional shallow learning in two ways: [4]
- (1) It emphasizes the depth of the model structure, usually with five, six, or even ten or more layers of hidden nodes; [4]
- (2) It makes the importance of feature learning explicit. That is, through layer-by-layer feature transformation, the representation of a sample in the original space is mapped into a new feature space, making classification or prediction easier. Compared with constructing features by hand-crafted rules, learning features from big data better captures the rich internal information of the data. [4]
- By designing and building a suitable number of neuron computing nodes and a multi-level computing hierarchy, choosing appropriate input and output layers, and learning and tuning the network, a functional relationship from input to output is established. The learned mapping may not express the true relationship between input and output perfectly, but it can approximate the actual relationship as closely as possible. With a trained network model, we can automate complex tasks, as the sketch below illustrates. [4]
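As an illustration of the preceding point, the following is a minimal sketch, assuming a tiny made-up problem (XOR) and plain NumPy, of a network whose input, hidden, and output layers are chosen in advance and whose weights are tuned by gradient descent so that the input-to-output mapping approaches the actual relationship:

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # target outputs

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # input -> hidden layer
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)   # hidden -> output layer
lr = 1.0

for step in range(5000):
    # forward pass: compute the current input-to-output mapping
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: tune weights so the mapping approaches the true relation
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())  # typically close to [0, 1, 1, 0]
```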
Typical models of deep learning
- Typical deep learning models include the convolutional neural network, the DBN, and the stacked auto-encoder network; these models are described below. [5]
Convolutional neural network model
- Before the emergence of unsupervised pre-training, training deep neural networks was usually very difficult; one notable exception was the convolutional neural network. Convolutional neural networks exploit spatially local receptive fields and shared weights, which greatly reduce the number of trainable parameters and make deeper architectures feasible to train, as the sketch below illustrates.
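As a concrete illustration, here is a minimal NumPy sketch (a made-up toy image and a hypothetical kernel, not from the source) of the convolution operation itself, showing the local receptive field and the weight sharing mentioned above:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution: slide one shared kernel over the image."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # local receptive field: each output unit sees only a small patch
            patch = image[i:i + kH, j:j + kW]
            out[i, j] = np.sum(patch * kernel)  # same weights at every position
    return out

image = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 "image"
edge_kernel = np.array([[1.0, -1.0]])             # hypothetical edge filter
print(conv2d(image, edge_kernel).shape)           # (6, 5): one feature map
```

Because the same small kernel is reused everywhere, the layer has only two parameters here, regardless of image size; this parameter sharing is what made convolutional networks trainable at depth before pre-training existed.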
Deep belief network model
- A DBN can be interpreted as a Bayesian probabilistic generative model composed of multiple layers of random hidden variables. The top two layers have undirected symmetric connections, while the layers below receive top-down directed connections from the layer above; the state of the bottom layer is the visible input data vector. A DBN is composed of a stack of several structural units, where the structural unit is usually an RBM (Restricted Boltzmann Machine). The number of neurons in the visible layer of each RBM unit in the stack equals the number of neurons in the hidden layer of the previous RBM unit. Following the deep learning mechanism, the input data are used to train the first RBM unit, whose output is then used to train the second RBM unit, and stacking RBMs improves model performance as layers are added. During unsupervised pre-training, after the DBN has encoded the input up to the top-level RBM, the top-level state is decoded back down to the lowest-level units, reconstructing the input. As the structural unit of the DBN, the RBM shares its parameters with the corresponding layer of the DBN. [5]
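The following is a minimal sketch of one RBM structural unit and a single CD-1 (one-step contrastive divergence) update; the layer sizes, toy binary data, and learning rate are made-up assumptions. A DBN would stack several of these, feeding each RBM's hidden activations to the next unit.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_visible, n_hidden, lr = 6, 3, 0.1
W = rng.normal(0, 0.01, (n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible-layer biases
b_h = np.zeros(n_hidden)    # hidden-layer biases

v0 = rng.integers(0, 2, size=(8, n_visible)).astype(float)  # toy binary batch

# positive phase: drive the hidden units from the data
p_h0 = sigmoid(v0 @ W + b_h)
h0 = (rng.random(p_h0.shape) < p_h0).astype(float)

# negative phase: one Gibbs step back down and up (reconstruction)
p_v1 = sigmoid(h0 @ W.T + b_v)
p_h1 = sigmoid(p_v1 @ W + b_h)

# CD-1 updates: move weights toward the data statistics,
# away from the reconstruction statistics
W   += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
b_v += lr * (v0 - p_v1).mean(axis=0)
b_h += lr * (p_h0 - p_h1).mean(axis=0)
```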
Stacked self-encoding network model
- The structure of the stacked self-encoding network is similar to that of the DBN: it consists of a stack of structural units, except that the structural unit is an autoencoder rather than an RBM. The autoencoder is a two-layer neural network: the first layer is called the encoding layer and the second the decoding layer. [5]
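Here is a minimal sketch of that two-layer structural unit, an encoding layer followed by a decoding layer trained to reconstruct its input; the toy data, layer sizes, and learning rate are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = rng.random((32, 8))                                  # toy input batch
W_enc = rng.normal(0, 0.1, (8, 3)); b_enc = np.zeros(3)  # encoding layer
W_dec = rng.normal(0, 0.1, (3, 8)); b_dec = np.zeros(8)  # decoding layer

for step in range(2000):
    h = sigmoid(X @ W_enc + b_enc)      # encode: compress 8 inputs to 3 features
    X_hat = sigmoid(h @ W_dec + b_dec)  # decode: reconstruct the input
    # gradient descent on the reconstruction error
    d_out = (X_hat - X) * X_hat * (1 - X_hat)
    d_h = (d_out @ W_dec.T) * h * (1 - h)
    W_dec -= 0.1 * h.T @ d_out; b_dec -= 0.1 * d_out.sum(axis=0)
    W_enc -= 0.1 * X.T @ d_h;   b_enc -= 0.1 * d_h.sum(axis=0)

print(np.abs(X_hat - X).mean())  # mean reconstruction error shrinks over training
```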
Deep learning training process
- In 2006, Hinton proposed an effective method for building multi-layer neural networks on unsupervised data. It has two steps: first, build the network one layer at a time, so that only a single-layer network is trained at each step; then, once all layers are trained, use the wake-sleep algorithm for tuning. [6]
- The weights between all layers except the top one are made bidirectional, so that the top layer remains a single-layer neural network while the other layers become graphical models. Upward weights are used for "cognition" (recognition) and downward weights for "generation". The wake-sleep algorithm then adjusts all the weights to make cognition and generation agree, that is, to ensure that the top-level representation generated from an input can restore the underlying nodes as accurately as possible. For example, if a top-level node represents a human face, then images of all faces should activate this node, and the image generated downward from that representation should look like an approximate face. The wake-sleep algorithm has two parts: wake and sleep. [6]
- Wake phase: the cognitive process. External features and the upward weights generate an abstract representation at each layer, while gradient descent modifies the downward weights between layers. [6]
- Sleep phase: the generative process. The top-level representation and the downward weights generate the bottom-layer state, while the upward weights between layers are modified; both phases are sketched below. [6]
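The following is a heavily simplified one-layer sketch of the two phases just described; the delta-rule updates, sizes, and data are schematic assumptions for illustration, not Hinton's exact formulation. The wake phase adjusts the downward (generative) weights, the sleep phase the upward (recognition) ones.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_v, n_h, lr = 6, 4, 0.1
R = rng.normal(0, 0.1, (n_v, n_h))   # upward / recognition ("cognition") weights
G = rng.normal(0, 0.1, (n_h, n_v))   # downward / generative weights

v = rng.integers(0, 2, size=n_v).astype(float)   # one observed toy pattern

# wake phase: recognize upward, then adjust the generative weights so the
# hidden representation can restore the layer below
h = (rng.random(n_h) < sigmoid(v @ R)).astype(float)
v_gen = sigmoid(h @ G)
G += lr * np.outer(h, v - v_gen)

# sleep phase: "dream" downward from a random top-level state, then adjust the
# recognition weights so they would infer the state that produced the dream
h_dream = rng.integers(0, 2, size=n_h).astype(float)
v_dream = (rng.random(n_v) < sigmoid(h_dream @ G)).astype(float)
h_rec = sigmoid(v_dream @ R)
R += lr * np.outer(v_dream, h_dream - h_rec)
```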
Bottom-up unsupervised learning
- Training starts from the bottom and proceeds layer by layer to the top, using unlabeled data (labeled data may also be included) to train the parameters of each layer in turn. This step can be regarded as an unsupervised training process, and it is the part that differs most from traditional neural networks. Specifically, the first layer is trained with unlabeled data, and its parameters are learned first; this layer can be regarded as the hidden layer of a three-layer neural network that minimizes the difference between output and input. Capacity restrictions and sparsity constraints force the resulting model to learn the structure of the data itself, yielding features more representative than the raw input. After layer n-1 has been learned, its output is used as the input for training layer n, and in this way the parameters of every layer are obtained, as the sketch below shows. [6]
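Here is a minimal sketch of that greedy layer-wise procedure: train layer 1 on the raw unlabeled data, then feed each layer's output in as the next layer's input, collecting one weight matrix per layer. The autoencoder used as the per-layer trainer, and all sizes and data, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_layer(data, n_hidden, steps=500, lr=0.1):
    """Train one autoencoder layer on `data`; return its encoding weights."""
    n_in = data.shape[1]
    W = rng.normal(0, 0.1, (n_in, n_hidden))
    W_dec = rng.normal(0, 0.1, (n_hidden, n_in))
    for _ in range(steps):
        h = sigmoid(data @ W)            # encode
        x_hat = sigmoid(h @ W_dec)       # reconstruct the layer's own input
        d_out = (x_hat - data) * x_hat * (1 - x_hat)
        d_h = (d_out @ W_dec.T) * h * (1 - h)
        W_dec -= lr * h.T @ d_out
        W -= lr * data.T @ d_h
    return W

X = rng.random((64, 16))       # unlabeled toy data
layer_sizes = [8, 4]           # two stacked layers
weights, inputs = [], X
for size in layer_sizes:
    W = train_layer(inputs, size)     # train layer n on layer n-1's output
    inputs = sigmoid(inputs @ W)      # layer n's output feeds layer n+1
    weights.append(W)
```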
Top-down supervised learning
- This step trains with labeled data, with the error transmitted from the top down to fine-tune the network. Starting from the layer parameters obtained in the first step, it further optimizes the parameters of the whole multi-layer model; this is a supervised training process. The first step plays the role that random initialization plays in a conventional neural network, but because it is not random, being learned from the structure of the input data instead, the initial values are closer to the global optimum and yield better results. The effectiveness of deep learning is therefore largely due to the feature learning in the first step. A minimal fine-tuning sketch follows. [6]
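The sketch below illustrates this top-down step: start from pretrained layer weights (here `pretrained` stands in for the result of the previous sketch, filled with placeholders so this snippet runs on its own), add an output layer, and fine-tune the whole stack on labeled data by backpropagation. Labels and sizes are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = rng.random((64, 16))
y = (X.sum(axis=1, keepdims=True) > 8).astype(float)   # toy labels
# placeholder for the non-random initialization learned in step one
pretrained = [rng.normal(0, 0.1, (16, 8)), rng.normal(0, 0.1, (8, 4))]
W_out = rng.normal(0, 0.1, (4, 1))                     # new classifier layer
lr = 0.5

for _ in range(1000):
    acts = [X]
    for W in pretrained:                    # forward through the pretrained stack
        acts.append(sigmoid(acts[-1] @ W))
    out = sigmoid(acts[-1] @ W_out)
    d = (out - y) * out * (1 - out)         # error starts at the top...
    grad_out = acts[-1].T @ d
    d = (d @ W_out.T) * acts[-1] * (1 - acts[-1])
    W_out -= lr * grad_out
    for i in reversed(range(len(pretrained))):  # ...and flows down, fine-tuning
        grad = acts[i].T @ d
        if i > 0:
            d = (d @ pretrained[i].T) * acts[i] * (1 - acts[i])
        pretrained[i] -= lr * grad
```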
Applications of deep learning
Computer vision
- The Multimedia Laboratory of the Chinese University of Hong Kong was the first Chinese team to apply deep learning to computer vision research. On the world-renowned LFW (Labeled Faces in the Wild) face recognition benchmark, the laboratory once beat Facebook to take first place, marking the first time that artificial intelligence's recognition ability in this field surpassed that of humans. [7]
Speech recognition
- Working with Hinton, Microsoft researchers were the first to introduce the RBM and the DBN into acoustic-model training for speech recognition, achieving great success in large-vocabulary speech recognition systems and reducing the speech recognition error rate by 30%. However, DNNs did not yet have efficient parallel training algorithms, and many research institutions were using large-scale corpora to improve the training efficiency of DNN acoustic models on GPU platforms. [8]
- Internationally, companies such as IBM and Google quickly began research on DNN-based speech recognition, and progress has been rapid. [8]
- Domestically, Alibaba, iFLYTEK, Baidu, the Institute of Automation of the Chinese Academy of Sciences, and others are also conducting research on deep learning for speech recognition. [8]
Natural language processing and other areas
- Many institutions are conducting research in natural language processing. In 2013, Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean published the paper Efficient Estimation of Word Representations in Vector Space, establishing the word2vec model. Compared with the traditional bag-of-words model, word2vec expresses grammatical information better. In natural language processing, deep learning is mainly used for machine translation and semantic mining; a toy skip-gram sketch follows. [9]
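The following is a toy sketch of the skip-gram idea behind word2vec, using a full softmax on a made-up nine-word corpus (the real model uses hierarchical softmax or negative sampling for efficiency): each word learns a vector by predicting the words that appear around it.

```python
import numpy as np

rng = np.random.default_rng(0)

corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D, lr = len(vocab), 8, 0.05

W_in = rng.normal(0, 0.1, (V, D))    # input vectors: the learned embeddings
W_out = rng.normal(0, 0.1, (D, V))   # output weights for context prediction

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for epoch in range(200):
    for pos, word in enumerate(corpus):
        for c in range(max(0, pos - 2), min(len(corpus), pos + 3)):
            if c == pos:
                continue                     # skip the center word itself
            v = W_in[idx[word]]
            grad_z = softmax(v @ W_out)      # predicted context distribution
            grad_z[idx[corpus[c]]] -= 1.0    # cross-entropy gradient
            grad_v = W_out @ grad_z
            W_out -= lr * np.outer(v, grad_z)
            W_in[idx[word]] -= lr * grad_v

print(W_in[idx["fox"]])  # the learned 8-dimensional vector for "fox"
```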