Surrogate Loss

A surrogate loss is a loss function that stands in for the true objective we actually care about because it is easier to optimize.

The true objective may be non-differentiable, expensive to evaluate, unstable to optimize directly, or only observable through samples. A surrogate loss gives the model a practical training signal that points in roughly the right direction.
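As a minimal sketch of this idea, assume a one-dimensional linear classifier with a single weight `w`. The true objective is the 0-1 error, which is piecewise constant in `w` and gives no gradient to follow. The logistic loss plays the role of the surrogate. The toy data, learning rate, and function names below are illustrative assumptions, not part of the original note.

```python
import math

def zero_one_loss(w, data):
    """True objective: fraction of misclassified points (labels are +1 or -1)."""
    return sum(1 for x, y in data if y * w * x <= 0) / len(data)

def logistic_loss(w, data):
    """Surrogate: mean of log(1 + exp(-y * w * x)), smooth and convex in w."""
    return sum(math.log1p(math.exp(-y * w * x)) for x, y in data) / len(data)

def logistic_grad(w, data):
    """Derivative of the logistic surrogate with respect to w."""
    return sum(-y * x / (1.0 + math.exp(y * w * x)) for x, y in data) / len(data)

# Hypothetical toy data as (feature, label) pairs.
data = [(-2.0, -1), (-1.0, -1), (1.0, 1), (2.0, 1)]

w_init = -0.5  # misclassifies every point
w = w_init
for _ in range(50):
    # Gradient descent on the surrogate; the 0-1 loss is never differentiated.
    w -= 0.5 * logistic_grad(w, data)

print("0-1 loss before:", zero_one_loss(w_init, data))
print("0-1 loss after: ", zero_one_loss(w, data))
print("logistic loss before/after:", logistic_loss(w_init, data), logistic_loss(w, data))
```

Here descending the surrogate also drives the 0-1 error down, which is the well-behaved case: the proxy and the true objective move together.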

Good surrogate losses

A useful surrogate loss should be:

differentiable or sub-differentiable
cheap enough to compute during training
correlated with the true objective
stable under optimization
hard to exploit in unintended ways

The last point matters because models optimize exactly what the loss rewards, not what we meant the loss to represent.

Failure mode

A surrogate can fail when improving the proxy no longer improves the real objective.

Examples:

lower cross-entropy does not always mean better-calibrated or more useful predictions
lower imitation loss does not always mean better task success
a higher reward-model score does not always mean outputs that humans actually prefer
a larger policy update may improve the surrogate but damage actual rollout performance

In each case the model is optimizing the proxy instead of the real goal.
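The cross-entropy example can be made concrete with a small sketch. Assume two hypothetical binary classifiers, each described only by the probability it assigns to the correct class on four examples, with a prediction counted as correct when that probability exceeds 0.5. The numbers are invented for illustration.

```python
import math

def cross_entropy(p_true):
    """Mean negative log-probability assigned to the correct class."""
    return -sum(math.log(p) for p in p_true) / len(p_true)

def accuracy(p_true):
    """Fraction of examples where the correct class gets more than 0.5."""
    return sum(1 for p in p_true if p > 0.5) / len(p_true)

# Hypothetical probabilities assigned to the correct class.
model_a = [0.51, 0.51, 0.51, 0.51]  # barely right on every example
model_b = [0.99, 0.99, 0.99, 0.45]  # very confident on three, wrong on one

for name, probs in [("A", model_a), ("B", model_b)]:
    print(name, "cross-entropy:", round(cross_entropy(probs), 3),
          "accuracy:", accuracy(probs))
```

Model B has the lower cross-entropy but the lower accuracy, so ranking the two models by the surrogate and ranking them by the metric we care about give opposite answers.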

Ayush Garg
