By Dan Kellett, Director of Data Science, Capital One UK
What are Neural Networks?
Neural Networks are a family of Machine Learning techniques loosely modelled on the human brain. Being able to extract hidden patterns within data is a key ability for any Data Scientist, and Neural Network approaches may be especially useful for extracting patterns from images, video or speech. This blog post aims to explain at a high level how these methods work and the key things to bear in mind.
The structure of a neural network
The network consists of different components:
– Input layer: this reflects the potential descriptive factors that may help in prediction.
– Hidden layers: a user-defined number of layers, each with a specified number of neurons.
– Output layer: this reflects the thing you are trying to predict. For example, this could be a label for an image or a more traditional 0/1 outcome.
– Weights: each neuron in a given layer is potentially connected to every neuron in adjacent layers – the weight sets the importance of this link. At first these weights should be randomized.
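To make the components above concrete, here is a minimal sketch in plain Python. The function name `init_network`, the layer sizes and the range of the random weights are illustrative assumptions, not something prescribed by any particular library, and the sketch leaves out details such as bias terms.

```python
import random

def init_network(layer_sizes, seed=0):
    """Return one weight matrix per pair of adjacent layers.

    weights[k][j][i] is the weight on the link from neuron i in layer k
    to neuron j in layer k + 1. All weights start out random.
    """
    rng = random.Random(seed)
    weights = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # Fully connected: every neuron links to every neuron in the next layer.
        weights.append([[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                        for _ in range(n_out)])
    return weights

# Assumed topology: 3 inputs, one hidden layer of 4 neurons, 1 output.
network = init_network([3, 4, 1])
```

Here `network[0]` holds the weights between the input layer and the hidden layer, and `network[1]` those between the hidden layer and the output layer.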
Training the network
In a basic neural network, you train the system by running individual cases through one at a time and updating the weights based on the error. The aim is that over time the network should become attuned to your data, minimizing error. This updating of weights in a basic neural network is the result of a two-way process using feed-forward and back-propagation techniques:
Feed-forward involves processing observations one at a time through the network. Given the weights in place, the model produces a prediction, and from this prediction and the actual outcome you can calculate the error in your model for that one observation.
Back-propagation involves taking that error back through the network to adjust the individual weights to better reflect the actual outcome. These new weights are then used for the next observation.
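The two steps can be sketched for a single observation as follows. This is a simplified illustration rather than a reference implementation: it assumes one hidden layer, sigmoid activations, a squared-error measure and no bias terms, and the function names and starting weights are made up for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def feed_forward(x, w_hidden, w_out):
    """Return the hidden activations and the prediction for one observation."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    pred = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, pred

def back_propagate(x, target, hidden, pred, w_hidden, w_out, learning_rate):
    """Adjust the weights in place using the error on one observation."""
    # Gradient of 0.5 * (pred - target)^2 with respect to the output neuron's input.
    delta_out = (pred - target) * pred * (1.0 - pred)
    for j, h in enumerate(hidden):
        # Error passed back to hidden neuron j, using the weight before its update.
        delta_h = delta_out * w_out[j] * h * (1.0 - h)
        w_out[j] -= learning_rate * delta_out * h
        for i, xi in enumerate(x):
            w_hidden[j][i] -= learning_rate * delta_h * xi

# Assumed starting weights: 2 inputs, 2 hidden neurons, 1 output.
w_hidden = [[0.1, -0.2], [0.3, 0.4]]
w_out = [0.5, -0.5]
x, target = [1.0, 0.5], 1.0

hidden, before = feed_forward(x, w_hidden, w_out)
back_propagate(x, target, hidden, before, w_hidden, w_out, learning_rate=0.1)
_, after = feed_forward(x, w_hidden, w_out)
```

After the update, `after` sits slightly closer to `target` than `before` did. Training repeats this pair of steps for every observation.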
This is the basic process for tuning a neural network; however, there are some key factors to consider when fitting a model to your data:
Learning rate: You do not want the weight updates from back-propagation to be too sensitive, as the model should not change dramatically because of one specific observation. Equally, you do not want your model to be unreactive to new data. As such, it is important to set an appropriate learning rate, which scales the size of each weight update, in the model optimization.
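A toy illustration of this trade-off, assuming a deliberately tiny model with a single weight `w` that predicts `w * x` and is fitted to one observation by repeated updates. The function name and the three learning rates are chosen purely for illustration.

```python
def fit_single_weight(learning_rate, steps=20, x=1.0, y=2.0):
    """Repeatedly update one weight so that w * x approaches y."""
    w = 0.0
    for _ in range(steps):
        error = w * x - y
        # The learning rate scales how far each update moves the weight.
        w -= learning_rate * error * x
    return w

slow = fit_single_weight(0.01)      # barely moves towards the ideal weight of 2.0
good = fit_single_weight(0.5)       # settles very close to 2.0
unstable = fit_single_weight(2.5)   # overshoots further on every step
```

With these assumed values, the smallest rate is still far from the target after 20 steps, the middle rate converges, and the largest rate diverges.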
Topology: The design of the network (number of hidden layers and number of neurons in each layer) may vary depending on the specific problem you are looking to solve and which branch of the Neural Network family of approaches you are using. Broadly, you may want to increase the number of neurons through the model if you want to have more features, or decrease it if you want to get only useful abstract features. This becomes even more important if you are using more complex techniques such as Deep Learning.
Dropout rate: Neural networks are known to suffer from over-fitting and can also be slow to train (though they are scalable and map-reduce training algorithms are available). With dropout, some neurons and all their connections are dropped out of the network during training with a given probability. Training in this way amounts to training an ensemble of neural networks and has been shown to improve performance.
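As a rough sketch, dropout can be applied to the outputs of one layer as shown below. The function name is hypothetical, and rescaling the surviving neurons by `1 / keep_prob` (often called inverted dropout) is an added assumption rather than something described above. It keeps the expected output of the layer the same whether or not dropout is switched on.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    """Zero each activation with probability drop_prob while training."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        # At prediction time every neuron is kept.
        return list(activations)
    keep_prob = 1.0 - drop_prob
    # A dropped neuron outputs 0, which removes its outgoing connections too.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)
layer_output = [1.0] * 1000
dropped = dropout(layer_output, drop_prob=0.5, rng=rng)
kept = dropout(layer_output, drop_prob=0.5, rng=rng, training=False)
```

Each call draws a fresh random mask, so every training step effectively uses a different thinned-out network.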
When would I use Neural Networks?
As with any technique related to data science, Neural Networks are one family among many approaches you could take to solve a business problem using large amounts of data. The technique is very processor-intensive and produces results that may be hard to interpret. If you are willing to sacrifice interpretability for power, Neural Networks may be a useful technique. This is especially the case for problems the human brain is very good at, such as text, image or voice recognition.
Credits: Shahriar Asta, Merlijn van Horssen