Deep Learning Key Terms, Explained
Gain a beginner's perspective on artificial neural networks and deep learning with this set of 14 straight-to-the-point definitions of related key concepts, including Biological Neuron, Multilayer Perceptron (MLP), Feedforward Neural Network, and Recurrent Neural Network.
Deep learning is a relatively new term, although the concept it names existed prior to the recent dramatic uptick in online searches. Enjoying a surge in research and industry, due mainly to its incredible successes in a number of different areas, deep learning is the process of applying deep neural network technologies - that is, neural network architectures with multiple hidden layers - to solve problems. Like data mining, deep learning is a process, one which employs deep neural network architectures, which in turn are particular types of machine learning algorithms.
Deep learning has racked up an impressive collection of accomplishments of late. In light of this, it's important to keep a few things in mind, at least in my opinion:
- Deep learning is not a panacea - it is not an easy one-size-fits-all solution to every problem out there
- It is not the fabled master algorithm - deep learning will not displace all other machine learning algorithms and data science techniques, or, at the very least, it has not yet proven so
- Tempered expectations are necessary - while great strides have recently been made in all types of classification problems, notably computer vision and natural language processing, as well as reinforcement learning and other areas, contemporary deep learning does not scale to working on very complex problems such as "solve world peace"
- Deep learning and artificial intelligence are not synonymous
- Deep learning can provide an awful lot to data science in the form of additional processes and tools to help solve problems, and when observed in that light, deep learning is a very valuable addition to the data science landscape
As shown in the image above, deep learning is to data mining as (deep) neural networks are to machine learning (process versus architecture). Also visible is the fact that deep neural networks are heavily involved in contemporary artificial intelligence, to the point that the 2 are so intertwined as to be bordering on synonymous (they are, however, not the same thing, and artificial intelligence has numerous other algorithms and techniques at its disposal beyond neural networks). Also note the connection between deep learning/deep neural networks and computer vision, natural language processing, and generative models, of particular importance given the great strides made in the recent past in these fields, driven by deep learning processes and neural network technologies.
So, with that, let's have a look at some deep learning related terminology, with a focus on concise, no-nonsense definitions.
As defined above, deep learning is the process of applying deep neural network technologies to solve problems. Deep neural networks are neural networks with one hidden layer minimum (see below). Like data mining, deep learning refers to a process, which employs deep neural network architectures, which are particular types of machine learning algorithms.
Artificial neural networks (ANNs) are the machine learning architecture, originally inspired by the biological brain (particularly the neuron), by which deep learning is carried out. ANNs alone (the non-deep variety) have actually been around for a very long time, and have historically been able to solve certain types of problems. Comparatively recently, however, neural network architectures were devised which included layers of hidden neurons (beyond simply the input and output layers), and this added level of complexity is what enables deep learning and provides a more powerful set of problem-solving tools.
ANNs actually vary in their architectures quite considerably, and therefore there is no definitive neural network definition. The 2 generally-cited characteristics of all ANNs are the possession of adaptive weight sets, and the capability of approximating non-linear functions of the inputs to neurons.
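Those 2 characteristics can be made concrete with a minimal sketch of a single artificial neuron (the function name and example values here are illustrative, not from any particular library): a weighted sum of inputs plus a bias, passed through a nonlinear activation.

```python
import math

def neuron_output(inputs, weights, bias):
    # Adaptive weights: each input is scaled by its (learnable) weight.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Nonlinear activation (sigmoid here) lets networks of such units
    # approximate non-linear functions of their inputs.
    return 1.0 / (1.0 + math.exp(-z))

# z = 0.5*0.8 + (-1.0)*0.2 + 0.1 = 0.3, then sigmoid(0.3) ~= 0.574
print(neuron_output([0.5, -1.0], [0.8, 0.2], 0.1))
```

Training an ANN amounts to adjusting the weight and bias values so that the network's outputs better match the desired outputs.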
Much is often made of the definitive connection between biological and artificial neural networks. Popular publications propagate the idea that ANNs are somehow an exact replica of what's going on in the human (or other biological) brain. This is clearly inaccurate; at best, early artificial neural networks were inspired by biology. The abstract relationship between the 2 is no more definitive than the abstract comparison drawn between the makeup and functionality of atoms and the solar system.
That said, it does do us some good to see how biological neurons work at a very high level, if simply to understand the inspiration for ANNs.
The major components of the biological neuron of interest to us are:
- The nucleus holds genetic information (i.e. DNA)
- The cell body processes input activations and converts them to output activations
- Dendrites receive activations from other neurons
- Axons transmit activations to other neurons
- The axon endings, along with neighboring dendrites, form the synapses between neurons
Chemicals called neurotransmitters then diffuse across the synaptic cleft between an axon ending and a neighboring dendrite, constituting a neurotransmission. The essential operation of the neuron is that an activation flows into a neuron via a dendrite, is processed, and is then retransmitted out an axon, through its axon endings, where it crosses the synaptic cleft, and reaches a number of receiving neurons’ dendrites, where the process is repeated.
A perceptron is a simple linear binary classifier. Perceptrons take inputs and associated weights (representing relative input importance), and combine them to produce an output, which is then used for classification. Perceptrons have been around a long time: early implementations date back to the 1950s, the first of which were involved in early ANN work.
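A minimal sketch of a perceptron and the classic perceptron learning rule (function names and the learning-rate/epoch values are illustrative assumptions, not from any particular library) might look like this, trained here on the linearly separable AND function:

```python
def perceptron_predict(inputs, weights, bias):
    # Linear combination of inputs and weights, thresholded at zero:
    # a simple linear binary classifier.
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation >= 0 else 0

def perceptron_train(data, labels, lr=1, epochs=10):
    # Perceptron learning rule: whenever a prediction is wrong,
    # nudge each weight by (error * input), and the bias by the error.
    n = len(data[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            error = y - perceptron_predict(x, weights, bias)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Learn the linearly separable AND function.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]
w, b = perceptron_train(X, y)
print([perceptron_predict(x, w, b) for x in X])  # → [0, 0, 0, 1]
```

Because a single perceptron draws only a linear decision boundary, it can learn AND but not a non-linearly-separable function such as XOR; that limitation is what motivates the multilayer perceptron below.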
A multilayer perceptron (MLP) is the implementation of several layers of perceptrons, each fully connected to its adjacent layers, forming a simple feedforward neural network (see below). The multilayer perceptron has the additional benefit of nonlinear activation functions, which single perceptrons do not possess.
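To see what the hidden layer and nonlinear activations buy us, here is a minimal forward-pass sketch of a two-input MLP with one hidden layer (the hand-picked weights and ReLU activation are illustrative assumptions) that computes XOR, a function no single perceptron can represent:

```python
def relu(z):
    # Nonlinear activation -- the ingredient single perceptrons lack.
    return max(0.0, z)

def mlp_forward(x, hidden_weights, hidden_biases, out_weights, out_bias):
    # One fully connected hidden layer with ReLU, then a linear output unit.
    hidden = [relu(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias

# Hand-picked weights that compute XOR:
#   h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1), output = h1 - 2*h2
W_h = [[1.0, 1.0], [1.0, 1.0]]
b_h = [0.0, -1.0]
W_o = [1.0, -2.0]
b_o = 0.0

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, mlp_forward(x, W_h, b_h, W_o, b_o))  # → 0.0, 1.0, 1.0, 0.0
```

In practice the weights are not hand-picked but learned, typically by backpropagation of errors through the layers; the forward pass, however, is exactly this stack of weighted sums and nonlinear activations.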