An activation function takes the weighted sum of a neuron's inputs (including the bias term) and transforms it into an output signal that is passed on to the next layer.
It does this because it introduces non-linearity, which allows the network to learn and model complex relationships in data; without it, stacked layers would collapse into a single linear transformation.
By applying a nonlinear transformation, the activation function decides whether, and how strongly, a neuron should be activated.
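As a minimal sketch of this idea, the snippet below computes a single neuron's weighted sum (pre-activation) and passes it through two common activation functions, ReLU and sigmoid. The input, weight, and bias values are hypothetical, chosen only for illustration.

```python
import numpy as np

def relu(z):
    # ReLU: passes z through when positive, outputs 0 otherwise
    return np.maximum(0.0, z)

def sigmoid(z):
    # Sigmoid: squashes z into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical neuron: inputs, weights, and bias are made-up example values
x = np.array([0.5, -1.0, 2.0])   # inputs
w = np.array([0.4, 0.3, 0.1])    # weights
b = 0.1                          # bias term

z = np.dot(w, x) + b             # weighted sum of inputs (pre-activation)
a = relu(z)                      # nonlinear transform -> output signal
```

Swapping `relu` for `sigmoid` in the last line changes only the nonlinearity applied, not the weighted-sum computation, which is why activation functions are treated as interchangeable components of a layer.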