Residual - Difference between observed and predicted values

Sum of the squared residuals (SSR) = Σ_i (observed_i - predicted_i)^2
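
A minimal sketch of the sum of the squared residuals in Python. The function name and the sample numbers are made up for illustration and are not taken from the video:

```python
def sum_squared_residuals(observed, predicted):
    """Add up the squared difference between each observed and predicted value."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Hypothetical data: three observations and a model that always predicts 0.5.
observed = [0.0, 1.0, 0.0]
predicted = [0.5, 0.5, 0.5]
ssr = sum_squared_residuals(observed, predicted)
print(ssr)
```

Each residual here is either -0.5 or 0.5, so every squared residual is 0.25. Squaring keeps positive and negative residuals from cancelling out.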

Watched the StatQuest video: https://youtu.be/IN2XmBhILt4?si=XKd8oO5JcQMqyD1t

Main Idea

When a parameter is unknown, Backpropagation estimates it as follows:

We use the chain rule to calculate the derivative of the Sum of the Squared Residuals with respect to the unknown parameter.
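
Written out, the chain rule step looks roughly like this. The symbol b for the unknown parameter is my own notation, and the last line assumes b is a bias that is simply added to the prediction, which may not match every setup:

```latex
\frac{d\,\mathrm{SSR}}{d\,b}
  = \sum_{i} \frac{d\,\mathrm{SSR}}{d\,\mathrm{predicted}_i}
             \cdot \frac{d\,\mathrm{predicted}_i}{d\,b}
  = \sum_{i} -2\,(\mathrm{observed}_i - \mathrm{predicted}_i)
             \cdot \frac{d\,\mathrm{predicted}_i}{d\,b}

% Assumption: predicted_i = (rest of the network)_i + b, so d predicted_i / d b = 1
\frac{d\,\mathrm{SSR}}{d\,b}
  = \sum_{i} -2\,(\mathrm{observed}_i - \mathrm{predicted}_i)
```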

We initialize the unknown parameter with a starting value and use Gradient Descent to optimize it.
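
A rough sketch of the whole idea for one unknown parameter. It assumes the unknown parameter is a bias b added to a fixed output from the rest of the network (called `base` here); the data, the learning rate, the step count, and all names are hypothetical choices, not values from the video:

```python
def ssr_gradient(observed, base, b):
    """Derivative of SSR with respect to b, where predicted_i = base_i + b.

    By the chain rule, each term is -2 * (observed_i - predicted_i) * 1.
    """
    return sum(-2 * (o - (x + b)) for o, x in zip(observed, base))


def fit_bias(observed, base, b_init=0.0, learning_rate=0.1, steps=100):
    """Run plain gradient descent on b, starting from b_init."""
    b = b_init
    for _ in range(steps):
        step_size = learning_rate * ssr_gradient(observed, base, b)
        b = b - step_size  # move against the slope to reduce SSR
    return b


# Hypothetical data: outputs of the rest of the network and the observed values.
base = [0.1, 0.5, 0.9]
observed = [1.1, 1.6, 1.7]
b = fit_bias(observed, base)
print(b)
```

For this simple setup, the SSR is smallest when b equals the average of observed_i - base_i, so the result can be checked by hand. A learning rate that is too large would make the steps overshoot, so 0.1 is only a value that happens to work for three data points.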