Batch Normalization is applied to layers of a neural network, where the output from the activation function is normalized and then rescaled.

Here are the steps for BatchNorm, using the following symbols:

  • m - the mean of the activations over the current mini-batch
  • s - the standard deviation of the activations over the current mini-batch
  • g - a learnable scale parameter (often written as gamma)
  • b - a learnable shift parameter (often written as beta)

Each activation x is first normalized, x̂ = (x − m) / s, and then scaled and shifted to produce the output y = g · x̂ + b.

Only g and b are trainable parameters, so they improve with more training; m and s are not trained but computed from each mini-batch (with running averages typically kept for use at inference time).
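To make the steps concrete, here is a minimal NumPy sketch of the forward pass described above. The function name batchnorm_forward, the array shapes, and the small eps stabilizer (added under the square root to avoid dividing by zero, as in the standard formulation) are illustrative assumptions, not code from the original.

```python
import numpy as np

def batchnorm_forward(x, g, b, eps=1e-5):
    """Minimal sketch of a BatchNorm forward pass (illustrative only).

    x: (batch_size, num_features) activations from the previous layer
    g: (num_features,) learnable scale parameter (gamma)
    b: (num_features,) learnable shift parameter (beta)
    """
    m = x.mean(axis=0)                 # per-feature mean over the mini-batch
    s = np.sqrt(x.var(axis=0) + eps)   # per-feature std, stabilized by eps
    x_hat = (x - m) / s                # normalize to ~zero mean, ~unit variance
    return g * x_hat + b               # scale and shift with the learnable params

# Hypothetical usage: a mini-batch of 4 examples with 3 features each
x = np.random.randn(4, 3) * 5 + 2     # activations with arbitrary mean/scale
g = np.ones(3)                        # gamma is typically initialized to 1
b = np.zeros(3)                       # beta is typically initialized to 0
y = batchnorm_forward(x, g, b)
print(y.mean(axis=0))                 # ~0 for each feature
print(y.std(axis=0))                  # ~1 for each feature (since g=1, b=0)
```

With g = 1 and b = 0 the output is just the normalized activations; during training, gradient descent adjusts g and b so the network can recover whatever scale and shift it finds useful.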