Kullback-Leibler (KL) divergence is a type of statistical distance: it measures how much an approximating distribution differs from the true distribution.
In other words, it quantifies the difference between what our model believes and what’s actually true.
Discrete cases:
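For a true distribution P and an approximating distribution Q over the same set of outcomes, the standard discrete definition can be written as below. The symbols P, Q, and the outcome set \mathcal{X} are naming assumptions made here for illustration.

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}
```

As a small illustrative check with made-up numbers, take P = (0.5, 0.5) and Q = (0.9, 0.1). Using the natural logarithm, the sum is 0.5 ln(0.5/0.9) + 0.5 ln(0.5/0.1) ≈ 0.511 nats. The value is zero only when P and Q agree on every outcome.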
Continuous cases:
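When P and Q have probability density functions, the sum becomes an integral. Here p and q denote the densities of the true and approximating distributions; these names are assumptions made here for illustration.

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \int_{-\infty}^{\infty} p(x) \log \frac{p(x)}{q(x)} \, dx
```

In both forms the true distribution weights the log ratio, so swapping the two arguments generally gives a different value.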