Forward propagation is the first stage of neural network training, and understanding how it works is essential when building an AI model. In this guide, I will give you a solid understanding of forward propagation and its steps, and explain how those steps work with practical examples.
What is forward propagation?
Forward propagation is the initial stage of neural network training. During forward propagation, input data flows through the neural network layer by layer, in the forward direction, to produce an output. Finally, the average loss over all the training data is calculated using a loss function.
Steps of forward propagation
Producing an output and calculating the loss for a given sample are the main steps of forward propagation. For a better understanding, these steps can be divided like this:
- Neural computation and Activation
- Loss value calculation
- Repeat the above steps
- Average loss value calculation
For example, imagine we have six 4-by-4 (16-pixel) black-and-white photos of the handwritten numbers 0 and 1, with three examples of each number. Our goal is to build a simple feedforward (MLP) neural network that identifies whether a given image belongs to the number 0 or 1.
For that, we use a feedforward neural network with an input layer, two hidden layers, and an output layer. The input layer has 16 neurons (each neuron gets one pixel value), each hidden layer has 6 neurons, and the output layer has 2 neurons (representing the numbers 0 and 1).
As activation functions, we use ReLU in the hidden layers and softmax in the output layer. The number corresponding to the output neuron with the highest probability is taken as the neural network's prediction for the given image.
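The two activation functions can be sketched in a few lines of NumPy (a minimal illustration of the math, not tied to any particular library's implementation):

```python
import numpy as np

def relu(z):
    # ReLU: keep positive values, replace negatives with 0
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the max before exponentiating for numerical stability,
    # then normalize so the outputs sum to 1 (a probability distribution)
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(relu(np.array([-1.0, 2.0])))    # [0. 2.]
print(softmax(np.array([1.0, 2.0])))  # two probabilities summing to 1
```

ReLU is applied element-wise in the hidden layers, while softmax looks at all the output neurons together, which is what lets us read its outputs as probabilities.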
Initial weight and bias values for each neuron:
Input Data(Images):
Black & White image pixel values are in the range between 0 – 255. 0 means black and 255 means white.
Neural computation and Activation
Here, data is fed into the input layer and then passes through the hidden layers to the output layer.
In the input layer, the fed data is normalized into a convenient numerical range to make mathematical operations on it easier (e.g. 151/255 ≈ 0.5922). Then, those values are passed to the hidden layer for multiplication, bias addition, and activation functions.
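As a rough sketch, here is what the normalization and one hidden neuron's computation look like for a single image. The pixel values, weights, and bias below are made-up placeholders; in a real network the weights and bias are learned during training:

```python
import numpy as np

# Made-up raw pixel values for one 4x4 image (0 = black, 255 = white)
pixels = np.array([0, 151, 255, 32, 0, 0, 200, 180,
                   10, 90, 255, 255, 0, 0, 64, 128], dtype=float)

# Normalize each pixel into the 0-1 range
x = pixels / 255.0
print(round(x[1], 4))  # 151/255 ≈ 0.5922

# One hidden neuron (illustrative weights and bias; real ones are learned)
w = np.full(16, 0.1)   # one weight per input pixel
b = 0.5                # bias
z = np.dot(w, x) + b   # weighted sum plus bias
a = max(0.0, z)        # ReLU activation
print(a)
```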
If you’re unfamiliar with mathematical computations and how neural networks make predictions, you can refer to our comprehensive guide on Artificial Neural Networks (ANN) for a deeper understanding.
(In the same way, the input neurons pass their values to all the neurons of hidden layer 1. Then, the hidden layer 1 neurons perform these mathematical computations on the values they receive.)
Then, the outputs of the hidden layer 1 neurons are passed to every neuron in hidden layer 2. The hidden layer 2 neurons also apply multiplications, additions, and activations to those values (the outputs of hidden layer 1).
After that, the outputs of hidden layer 2 are passed to every neuron of the output layer, where the final result is produced. Each output neuron takes the outputs of hidden layer 2 as inputs and applies the same multiplication, addition, and activation to that data. Neurons in the output layer use a special activation function (here, softmax) to produce the final result in the form we need.
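Putting the layers together, a full forward pass through our 16-6-6-2 network could be sketched like this. The weights here are random placeholders rather than trained values, so the printed probabilities are meaningless; the point is the shape of the computation:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Random placeholder parameters for the 16 -> 6 -> 6 -> 2 network
W1, b1 = rng.normal(size=(6, 16)), np.zeros(6)
W2, b2 = rng.normal(size=(6, 6)),  np.zeros(6)
W3, b3 = rng.normal(size=(2, 6)),  np.zeros(2)

def forward(x):
    h1 = relu(W1 @ x + b1)        # hidden layer 1
    h2 = relu(W2 @ h1 + b2)       # hidden layer 2
    return softmax(W3 @ h2 + b3)  # output layer: 2 probabilities

x = rng.random(16)                # a normalized 16-pixel input
probs = forward(x)
print(probs)                      # [p(number 0), p(number 1)]
print("predicted number:", int(np.argmax(probs)))
```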
For a better understanding, let's imagine our AI model produced an output like this for the first input image:
Here you can see our model says there is a 0.26 chance of this image being the number 0 and a 0.74 chance of it being the number 1. Assume that output neuron 1 represents the number 0 and output neuron 2 represents the number 1. So the AI model tells us it is an image of the number 1 (which is the wrong answer).
Next, we should determine how wrong our AI model is. For that, we use the loss value.
Loss value calculation
Our AI model's predictions are not yet accurate, so we have to adjust the model parameters to increase accuracy. To do that, we first have to measure the difference between the predicted value and the true value: the loss value.
Here, we calculate the loss value using the Mean Squared Error (MSE) loss function. For each data sample (image), it takes the squared difference between the predicted and expected output of each neuron, sums these squared values, and then divides the sum by the number of output neurons.
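Using the example output from earlier (0.26 for the number 0, 0.74 for the number 1), and assuming the true label 0 is encoded as the one-hot target [1, 0], the MSE for that single sample works out like this:

```python
import numpy as np

def mse(predicted, expected):
    # Mean of the squared differences across the output neurons
    return np.mean((predicted - expected) ** 2)

predicted = np.array([0.26, 0.74])  # softmax output from the example
expected  = np.array([1.0, 0.0])    # one-hot target: the image is really a 0
print(mse(predicted, expected))     # ((0.26-1)^2 + (0.74-0)^2) / 2 ≈ 0.5476
```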
Above, we obtained the MSE loss value for just one data sample. We could use this MSE value to update the model's parameters so that it gives the correct answer for that sample only. But to update the parameters so the model's predictions are accurate for all the data samples, we have to calculate the MSE value for the other samples too.
Repeat the above steps
We apply the same steps used above to the other data samples to calculate the MSE loss value for each of them:
Average loss value calculation
Now we can calculate a single loss value that represents how poorly our AI model performs across all data samples: the average loss value. Since we use the MSE loss function, we can call this the average MSE loss.
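The average is simply the per-sample MSE values summed and divided by the number of samples. The first value below follows from the 0.26/0.74 example against a true label of 0; the other five are made-up placeholders for the remaining images:

```python
# Per-sample MSE losses for the six training images: the first comes from
# the 0.26/0.74 example; the other five are illustrative placeholders
sample_losses = [0.5476, 0.31, 0.12, 0.45, 0.28, 0.09]

# Average loss: sum of per-sample losses divided by the number of samples
avg_mse = sum(sample_losses) / len(sample_losses)
print(round(avg_mse, 4))  # ≈ 0.2996
```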
This average loss value gives us an idea of how we should adjust the model parameters to increase prediction accuracy across all the training samples. In the next training stage, backpropagation, the average loss value is used to update the AI model's parameters.
A proper understanding of forward propagation is essential for working with neural networks. In this guide, we've walked through the steps of forward propagation, where inputs are transformed into predictions. By following the principles outlined here, both beginners and enthusiasts can deepen their understanding of how neural networks operate, opening the door to further discoveries in the fascinating world of artificial intelligence.