• Intuition: Measures the average squared difference between predicted and actual values.
  • Characteristics:
    • Produces a quadratic penalty → larger errors are penalized disproportionately (doubling an error quadruples its penalty).
    • Very sensitive to outliers (a single large error can dominate).
    • Differentiable everywhere, making it mathematically convenient for gradient-based optimization.
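
The characteristics above can be illustrated with a minimal sketch in plain Python. The function names `mse` and `mse_gradient` and the sample data are assumptions chosen for illustration, not taken from the source; the gradient is taken with respect to the predictions.

```python
def mse(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mse_gradient(y_true, y_pred):
    """Partial derivative of the loss w.r.t. each prediction: 2 * (p - t) / n."""
    n = len(y_true)
    return [2 * (p - t) / n for t, p in zip(y_true, y_pred)]


# Hypothetical data: four targets and two sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
small_errors = [1.5, 2.5, 3.5, 4.5]  # every prediction is off by 0.5
one_outlier = [1.0, 2.0, 3.0, 14.0]  # three exact predictions, one off by 10

print(mse(y_true, small_errors))          # 0.25
print(mse(y_true, one_outlier))           # 25.0
print(mse_gradient(y_true, one_outlier))  # [0.0, 0.0, 0.0, 5.0]
```

Although three of the four predictions in `one_outlier` are exact, its loss is 100 times that of `small_errors`, which shows how a single large error can dominate. The gradient is defined at every point, including where the error is zero, and grows linearly with the size of the error.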