Machine learning is just one part of AI, but it encompasses a large family of algorithms. One method that you hear about frequently today is “deep learning,” which has received a fair share of attention in the news in recent years. To understand its popularity and success, it’s helpful to understand how it works. Deep learning is an evolution of a machine learning algorithm that was popular in the 1980s and that you may recognize: neural networks.
Neural networks, a programming paradigm in which we train machines to “learn,” are inspired by neurons: specialized cells in the human body that form the foundation of our nervous system, and of our brains in particular. These cells transmit signals throughout our bodies to trigger nervous system responses and processes. Neurons are what enable us to see, hear, smell, etc.
In part one of this blog series, we discussed the basic process of human intelligence: input on the left, and output on the right. The neuron (pictured above) plays a critical role in this. On the left side of the neuron, the cell body collects “input.” Once it receives enough input or stimulation, the axon fires, transmitting the information to the right side—the synapse. The “output” is then sent to other neurons.
At any given moment, our neurons are passing messages between each other. These cells are responsible for our ability to perceive our surroundings. And when we learn, our neurons become very active. In fact, much of what we think of as human learning can be described by how strong the connection between two neurons in our brain is, along with the strength of the firing of our synapses.
A neural network is a mathematical simulation of a collection of neurons. The image below represents a basic neural network with 3 layers and 12 nodes. Each circular node represents an artificial, biologically inspired “neuron.” Each line represents a connection from the output of one artificial neuron on the left to the input of another on the right, and signals flow along these lines from left to right. In these networks, input (such as pixel data) flows from the input layer, through the middle “hidden” layers, and ultimately to the output layer. At each step it is transformed by mathematical equations that are loosely inspired by the electrical activity in actual biological neurons.
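To make the left-to-right flow concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the network from the image: the layer sizes (4 input, 5 hidden, and 3 output nodes, which add up to 12), the sigmoid activation, the random connection strengths, and the input values are all made up for the example.

```python
import math
import random


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one layer: each neuron sums its weighted inputs, adds a bias, applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


def random_layer(n_in, n_out, rng):
    """Create untrained connection strengths (weights) for a fully connected layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed sizes: 4 input + 5 hidden + 3 output = 12 nodes.
hidden_w, hidden_b = random_layer(4, 5, rng)
output_w, output_b = random_layer(5, 3, rng)

pixels = [0.0, 0.5, 1.0, 0.25]  # made-up input values standing in for pixel data

hidden = layer_forward(pixels, hidden_w, hidden_b)   # input layer -> hidden layer
outputs = layer_forward(hidden, output_w, output_b)  # hidden layer -> output layer
print(outputs)
```

Because the weights here are random, the three output values are meaningless until the network has been trained; the sketch only shows how a signal moves through the layers.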
Neural networks learn by trying to match data sets presented to the input layer to desired outcomes in the output layer. The mathematical equations compute the outputs and compare them to the desired outcomes, and the resulting differences are used to tweak the strength of the connections. The connection strengths are adjusted iteratively until the computed output is close enough to the desired outcome, at which point we say the neural network has “learned.”
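The compute, compare, and tweak loop described above can be sketched with a single artificial neuron. This is a simplified illustration, not a full training algorithm for a multi-layer network: the toy data (the logical OR function), the learning rate, the tolerance, and the update rule (nudging each weight in proportion to the error, which is the standard gradient step for a sigmoid neuron under cross-entropy loss) are all assumptions chosen for the example.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Toy labeled data (assumed for illustration): inputs and the desired output of logical OR.
examples = [
    ([0.0, 0.0], 0.0),
    ([0.0, 1.0], 1.0),
    ([1.0, 0.0], 1.0),
    ([1.0, 1.0], 1.0),
]

weights = [0.0, 0.0]  # connection strengths, starting with no knowledge
bias = 0.0
learning_rate = 0.5   # assumed step size for each tweak
tolerance = 0.1       # assumed definition of "close enough"


def predict(inputs):
    """Compute the neuron's output for one example."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def max_error():
    """Largest gap between desired and computed output across all examples."""
    return max(abs(target - predict(inputs)) for inputs, target in examples)


epochs = 0
while max_error() > tolerance and epochs < 10000:
    for inputs, target in examples:
        error = target - predict(inputs)  # compare computed output to desired outcome
        for i, x in enumerate(inputs):
            weights[i] += learning_rate * error * x  # tweak each connection strength
        bias += learning_rate * error
    epochs += 1

print(epochs, weights, bias)
```

The loop stops once every computed output is within the tolerance of its desired outcome, which is the point at which we would say this tiny network has “learned.”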
As an example, imagine that we want a neural network to learn to identify cats and dogs from a set of images. A possible neural network would feed the image pixels to the input layer, and then map the first output node to “cat,” the second output node to “dog,” and the third output node to “other.” Over time, as we show the neural network more and more examples of cats and dogs, along with the right answers (the dataset “labels”), the learning process will improve the accuracy of the neural network’s answers.
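Here is a small sketch of how three output nodes might be mapped to the answers “cat,” “dog,” and “other.” It assumes the network's raw output scores are converted to probabilities with a softmax function, which is a common choice but one the example above does not specify; the scores themselves are invented for illustration.

```python
import math

# Output node 0 -> "cat", node 1 -> "dog", node 2 -> "other".
LABELS = ["cat", "dog", "other"]


def softmax(scores):
    """Turn raw output-node scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores):
    """Return the label of the strongest output node and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return LABELS[best], probs[best]


raw_scores = [2.0, 0.5, -1.0]  # made-up scores from the three output nodes
label, confidence = classify(raw_scores)
print(label, round(confidence, 2))
```

During training, the label attached to each image tells the network which output node should have been the strongest, and the difference drives the adjustments to the connection strengths.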
The neural network in the above example is very simple, made up of only a small number of nodes and layers. Deep learning refers to neural networks that are larger and more complex, with many more layers, like the one in the image below.
These “deeper” neural networks can make much more complex predictions. There can be thousands of nodes and hundreds of layers, and because every connection between nodes is a separate calculation, the amount of computation grows much faster than the number of nodes. Deep learning models have become very good at specific problems, such as speech or image recognition.
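To get a feel for how quickly the work grows, the sketch below counts the connections in two fully connected networks. The layer sizes are made up for illustration, and real deep learning models are not always fully connected, so treat the numbers as a rough intuition rather than a measurement.

```python
def count_connections(layer_sizes):
    """Count connections when every node links to every node in the next layer."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


small_network = [4, 5, 3]   # assumed sizes: 12 nodes in 3 layers
deep_network = [100] * 10   # assumed sizes: 1,000 nodes in 10 layers

small_count = count_connections(small_network)  # 4*5 + 5*3
deep_count = count_connections(deep_network)    # 9 pairs of layers, 100*100 each
print(small_count, deep_count)
```

In this made-up comparison, the deeper network has roughly 80 times as many nodes but well over 2,000 times as many connections to compute and adjust.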
It’s worth noting, however, that deep learning is not a silver bullet for machine learning, especially not in cybersecurity, where the large volume of clean data that deep learning methods need is not always available. It is important to pick the right algorithm, data, and principles for the job. Doing so is the best way for machines to gather evidence, connect the dots, and draw a conclusion.
Neural networks might seem like the stuff of the future, but they’ve been around for a while. In fact, neural networks are based on ideas that started circulating back in the 1940s. In our next blog, we’ll take a quick trip back in time to understand how neural networks and machine learning have come to permeate many parts of modern life.
Stephan Jou is Chief Technology Officer at Interset.