- Intuition: Measures the average squared difference between predicted and true values.
- Characteristics:
- Produces a quadratic penalty → larger errors are penalized much more strongly (doubling an error quadruples its penalty).
- Very sensitive to outliers (a single large error can dominate).
- Differentiable everywhere, making it mathematically convenient for gradient-based optimization.
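
The characteristics above can be illustrated with a minimal sketch in plain Python. The helper names `mse` and `mse_grad` and the sample values are assumptions made for illustration; they do not come from any particular library. The example compares predictions that are all slightly off with predictions that are exact except for one large miss, then shows the gradient with respect to each prediction.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_grad(y_true, y_pred):
    """Gradient of the MSE with respect to each prediction: 2 * (p - t) / n."""
    n = len(y_true)
    return [2 * (p - t) / n for t, p in zip(y_true, y_pred)]


# Hypothetical sample data, chosen so the arithmetic is easy to check by hand.
y_true = [1.0, 2.0, 3.0, 4.0]
close = [1.5, 2.5, 3.5, 4.5]      # every prediction off by 0.5
outlier = [1.0, 2.0, 3.0, 14.0]   # three exact predictions, one off by 10

print(mse(y_true, close))         # 0.25
print(mse(y_true, outlier))       # 25.0
print(mse_grad(y_true, outlier))  # [0.0, 0.0, 0.0, 5.0]
```

The single miss of 10 contributes 100 to the sum of squares, so it alone drives the loss to 25.0, a hundred times the loss of the uniformly-off predictions. The gradient is a simple linear function of each error, defined at every point including zero error, which is what makes the loss convenient for gradient-based optimization.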