Residual
- Difference between observed and predicted values
Sum of the squared residuals
- $\text{SSR} = \sum_{i=1}^{n} (\text{observed}_i - \text{predicted}_i)^2$
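A small sketch of the two definitions above. The data values and the names `observed`, `predicted`, `residuals`, and `ssr` are made up for illustration.

```python
# Illustrative data; these numbers are assumptions, not from the video.
observed = [1.0, 2.0, 4.0]
predicted = [1.5, 2.0, 3.0]

# Residual = observed - predicted, one per data point.
residuals = [o - p for o, p in zip(observed, predicted)]

# Sum of the squared residuals (SSR).
ssr = sum(r ** 2 for r in residuals)

print(residuals)  # [-0.5, 0.0, 1.0]
print(ssr)        # 1.25
```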
Watched the StatQuest video: https://youtu.be/IN2XmBhILt4?si=XKd8oO5JcQMqyD1t
Main Idea
When a parameter is unknown in Backpropagation:
- We use the chain rule to calculate the derivative of the Sum of the Squared Residuals with respect to the unknown parameter.
- We initialize the unknown parameter with a starting value and use Gradient Descent to optimize it, repeatedly stepping the parameter in the direction that lowers the Sum of the Squared Residuals.
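A minimal sketch of this idea, assuming a toy model rather than the network from the video: predicted = w * x + b, where the weight w is taken as known and the bias b is the unknown parameter. By the chain rule, d(SSR)/d(b) = d(SSR)/d(predicted) * d(predicted)/d(b) = -2 * (observed - predicted) * 1, summed over the data points. The function names, data, learning rate, and step count are all hypothetical choices for illustration.

```python
def predict(x, b, w=2.0):
    """Hypothetical model: fixed, known weight w and unknown bias b."""
    return w * x + b


def ssr_gradient(xs, observed, b):
    """Derivative of SSR with respect to b, via the chain rule.

    d(SSR)/d(predicted) = -2 * (observed - predicted)
    d(predicted)/d(b)   = 1
    """
    return sum(-2.0 * (o - predict(x, b)) * 1.0 for x, o in zip(xs, observed))


def fit_bias(xs, observed, b=0.0, learning_rate=0.1, steps=100):
    """Gradient Descent: start from an initial guess and step downhill."""
    for _ in range(steps):
        b -= learning_rate * ssr_gradient(xs, observed, b)
    return b


# Made-up data generated by 2 * x + 1, so the best bias is 1.
xs = [0.0, 1.0, 2.0]
observed = [1.0, 3.0, 5.0]

b = fit_bias(xs, observed)
print(b)  # should be very close to 1.0
```

The learning rate matters here: with these three points each step shrinks the error in b by a constant factor, but a much larger learning rate would overshoot and diverge.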