The Softplus function is a smooth approximation of the ReLU activation function, defined as

$$\text{softplus}(x) = \ln(1 + e^x)$$

It is commonly used as an activation function in neural networks.
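A minimal NumPy sketch of this definition (the function name and the use of `np.log1p` are choices made here, not taken from any particular library):

```python
import numpy as np

def softplus(x):
    """Softplus: log(1 + exp(x)), a smooth approximation of ReLU."""
    # Naive form; see the numerically stable version later in this article.
    return np.log1p(np.exp(x))
```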
Intuition
Softplus behaves like a smoothed-out version of ReLU:
- For large positive values of $x$, $e^x$ dominates, so $\ln(1 + e^x) \approx x$, just like ReLU.
- For large negative values of $x$, $e^x \approx 0$, so $\ln(1 + e^x) \approx 0$.

So Softplus is close to $\max(0, x)$ everywhere, but with a smooth curve around $x = 0$ instead of a sharp corner.
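A quick numeric check of these claims, reusing the sketch above (printed values in the comments are approximate):

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def relu(x):
    return np.maximum(0.0, x)

xs = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(softplus(xs))  # ~[0.000045, 0.3133, 0.6931, 1.3133, 10.000045]
print(relu(xs))      #  [0.0,      0.0,    0.0,    1.0,    10.0]
# Far from zero the two functions nearly coincide; near zero Softplus is smooth.
```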
Derivative
The derivative of Softplus is the sigmoid function:

$$\frac{d}{dx}\,\text{softplus}(x) = \frac{1}{1 + e^{-x}} = \sigma(x)$$

This matters because Softplus is differentiable everywhere, unlike ReLU, which has a corner at $x = 0$.
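A finite-difference check that the slope of Softplus matches the sigmoid (a rough sketch; the step size and tolerance are arbitrary choices):

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
h = 1e-5
numeric = (softplus(x + h) - softplus(x - h)) / (2 * h)  # central difference
assert np.allclose(numeric, sigmoid(x), atol=1e-6)
```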
Properties
- Domain: all real numbers
- Range: $(0, \infty)$
- Smooth and differentiable everywhere
- Always positive
- Approaches $x$ as $x \to \infty$
- Approaches $0$ as $x \to -\infty$
Why Use It
Softplus can be useful when you want the behavior of ReLU but need a smooth function. This can make optimization easier in cases where sharp corners cause issues.
One tradeoff is that Softplus is more expensive to compute than ReLU because it requires an exponential and a logarithm.
Numerically Stable Form
For large $x$, computing $e^x$ in $\ln(1 + e^x)$ overflows in floating point, so implementations typically use the equivalent form

$$\text{softplus}(x) = \max(x, 0) + \ln(1 + e^{-|x|})$$

which keeps the exponent non-positive.
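A sketch comparing the naive and stable forms (some libraries instead switch to the identity above a threshold; this is just one option):

```python
import numpy as np

def softplus_naive(x):
    return np.log1p(np.exp(x))          # exp(x) overflows for large x

def softplus_stable(x):
    # max(x, 0) + log(1 + exp(-|x|)): the exponent is always <= 0,
    # so exp never overflows and the log1p argument stays in [0, 1].
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

print(softplus_naive(1000.0))   # inf (with an overflow warning)
print(softplus_stable(1000.0))  # 1000.0
```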