() =

  • = Softmax
  • = input vector
  • e = standard exponential function for input vector
  • K = number of classes in multi-class classifier
  • e = standard exponential function for output vector

Learned about Softmax Equation when learning about Temperature (LLMs)

Purpose

The purpose of the function is to convert a set of numbers into probabilities. Each number represents confidence level of a model that a particular data point belongs to a certain class.

Basically takes input and outputs probabilities (from my understanding) of that being the next thing generated

Similar to the Maxwell-Botlzmann Speed Distribution