Understanding backpropagation

  • Neural networks are the most important foundation of deep learning. Their arrival has brought AI closer to us; they are loosely inspired by the human brain.

Forward pass

  • This is the process of computing the value of each node in the neural network. It is what lets us infer the network's output from an input.
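The forward pass can be sketched in plain Python for a small fully connected network with sigmoid activations; the layer sizes and weight values below are made up for illustration:

```python
import math

def sigmoid(z):
    # Squash the weighted sum into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def forward_layer(inputs, weights):
    # weights[j][i] is the weight from input i to node j of this layer
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

# Hypothetical 2-input -> 2-hidden -> 1-output network
x = [0.5, -1.0]
W_hidden = [[0.1, 0.4], [-0.3, 0.2]]
W_output = [[0.6, -0.5]]

h = forward_layer(x, W_hidden)   # hidden activations
y = forward_layer(h, W_output)   # network output
```

Each layer is just a matrix of weights applied to the previous layer's values, followed by the activation; stacking `forward_layer` calls gives the full forward pass.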

Backpropagation

  • It is the process of updating the weights of a neural network using an optimization algorithm such as Gradient Descent (SGD, RMSProp, …)
  • Here we will work through backpropagation for a neural network like the one below, with:
    • Loss: MSE
    • Activation at each node: sigmoid

    • In order to update a weight, Gradient Descent requires us to compute the derivative of the Loss with respect to that weight. Writing net_j = Σ_i w_ij·x_i for the weighted input of node j, the chain rule gives:

      ∂L/∂w_ij = (∂L/∂net_j)·(∂net_j/∂w_ij) = δ_j·x_i,  where δ_j = ∂L/∂net_j

    So in order to compute the derivative of the loss with respect to each weight, we only need to compute the error term δ_j of each node.

    • Because the derivative of the loss is computed differently for the output layer and for the hidden layers, we get two separate results:
      • With the output layer (MSE loss L = ½·Σ_k (o_k − t_k)², sigmoid output o_k, target t_k):

        δ_k = (o_k − t_k)·o_k·(1 − o_k)

      • With a hidden layer: assume k ranges over the nodes of the layer next to j (the layer after j), and h_j is the activation of node j:

        δ_j = h_j·(1 − h_j)·Σ_k w_jk·δ_k

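The output-layer and hidden-layer derivatives can be verified numerically against central finite differences; below is a minimal sketch in which the weights, input, and target are made-up values for illustration:

```python
import copy
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(W_h, W_o, x, t):
    # Forward pass: sigmoid hidden layer, sigmoid output, MSE loss 1/2 (o - t)^2
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W_h]
    o = sigmoid(sum(w * hi for w, hi in zip(W_o, h)))
    return 0.5 * (o - t) ** 2

# Made-up weights, input and target
x, t = [0.5, -1.0], 1.0
W_h = [[0.1, 0.4], [-0.3, 0.2]]
W_o = [0.6, -0.5]

# Deltas computed from the backpropagation formulas
h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W_h]
o = sigmoid(sum(w * hi for w, hi in zip(W_o, h)))
delta_o = (o - t) * o * (1 - o)                   # output-layer delta
delta_h0 = h[0] * (1 - h[0]) * W_o[0] * delta_o   # hidden-layer delta of node 0

grad_out = delta_o * h[0]    # analytic dL/dW_o[0]
grad_hid = delta_h0 * x[0]   # analytic dL/dW_h[0][0]

# Central finite differences as an independent check
eps = 1e-6
Wp, Wm = list(W_o), list(W_o)
Wp[0] += eps; Wm[0] -= eps
num_out = (loss(W_h, Wp, x, t) - loss(W_h, Wm, x, t)) / (2 * eps)

Hp, Hm = copy.deepcopy(W_h), copy.deepcopy(W_h)
Hp[0][0] += eps; Hm[0][0] -= eps
num_hid = (loss(Hp, W_o, x, t) - loss(Hm, W_o, x, t)) / (2 * eps)

assert abs(grad_out - num_out) < 1e-8
assert abs(grad_hid - num_hid) < 1e-8
```

This kind of gradient check is a standard way to catch sign errors or missing terms in a hand-derived backward pass.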
Implement backpropagation step by step

  • Initialize the weights of the neural network, randomly or with some initialization method
  • For each training example:
    • Forward the input through the network to compute the value of every node (hidden and output layers).
    • For each output node k (the model may have several outputs):

      δ_k = (o_k − t_k)·o_k·(1 − o_k)

    • For each node j in a hidden layer (h_j is its activation, k ranges over the next layer):

      δ_j = h_j·(1 − h_j)·Σ_k w_jk·δ_k

    • Update each network weight (η is the learning rate; x_i is the value the weight multiplies, i.e. an input feature or the previous layer's activation):

      w_ij ← w_ij − η·δ_j·x_i
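The step-by-step procedure above can be sketched as a small two-layer network in plain Python; the XOR dataset, layer sizes, learning rate, and seed below are arbitrary choices for illustration, and convergence depends on those choices:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TwoLayerNet:
    # Fully connected net: n_in -> n_hidden -> n_out, sigmoid activations, MSE loss.
    def __init__(self, n_in, n_hidden, n_out, lr=0.5, seed=0):
        rng = random.Random(seed)
        # Step 1: initialize weights randomly
        self.W_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.W_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.lr = lr

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in self.W_h]
        o = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in self.W_o]
        return h, o

    def train_step(self, x, t):
        # Step 2a: forward the input to get every node's value
        h, o = self.forward(x)
        # Step 2b: output-layer deltas, (o_k - t_k) * o_k * (1 - o_k)
        d_o = [(ok - tk) * ok * (1 - ok) for ok, tk in zip(o, t)]
        # Step 2c: hidden-layer deltas, h_j (1 - h_j) * sum_k w_jk * delta_k
        d_h = [hj * (1 - hj) * sum(self.W_o[k][j] * d_o[k] for k in range(len(d_o)))
               for j, hj in enumerate(h)]
        # Step 2d: gradient-descent updates, w <- w - lr * delta * input
        for k, row in enumerate(self.W_o):
            for j in range(len(row)):
                row[j] -= self.lr * d_o[k] * h[j]
        for j, row in enumerate(self.W_h):
            for i in range(len(row)):
                row[i] -= self.lr * d_h[j] * x[i]
        # Return the pre-update MSE loss for monitoring
        return 0.5 * sum((ok - tk) ** 2 for ok, tk in zip(o, t))

# XOR: a classic toy problem that needs the hidden layer
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
net = TwoLayerNet(2, 4, 1, lr=2.0, seed=1)
for _ in range(5000):
    for x, t in data:
        net.train_step(x, t)
```

The loop over `data` is exactly the "for each training example" step of the algorithm; monitoring the value returned by `train_step` shows the loss falling as the weights are updated.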