A categorical policy is like a classifier over discrete actions.
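You build the neural network for a categorical policy the same way you would for a classifier: the observation is the input, the final linear layer produces one logit per action, and a softmax converts the logits into probabilities.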
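A minimal NumPy sketch of such a classifier-style policy follows. The layer sizes, the tanh hidden layer, the random weights, and the names `softmax` and `categorical_policy` are all assumptions made for illustration, not part of the original text.

```python
import numpy as np

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def categorical_policy(obs, params):
    # Hypothetical two-layer network: observation -> hidden -> one logit per action.
    W1, b1, W2, b2 = params
    hidden = np.tanh(obs @ W1 + b1)
    logits = hidden @ W2 + b2
    return softmax(logits)  # vector of action probabilities

# Assumed sizes, chosen only for the example.
rng = np.random.default_rng(0)
obs_dim, hidden_dim, n_actions = 4, 8, 3
params = (
    rng.normal(size=(obs_dim, hidden_dim)), np.zeros(hidden_dim),
    rng.normal(size=(hidden_dim, n_actions)), np.zeros(n_actions),
)

probs = categorical_policy(rng.normal(size=obs_dim), params)
action = rng.choice(n_actions, p=probs)  # sample an action index from the policy
```

The only difference from a classifier is how the output is used: rather than taking the most likely class, the policy samples an action index from the probability vector.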
Log-Likelihood. Denote the last layer of probabilities as P_{\theta}(s). It is a vector with as many entries as there are actions, so we can treat the actions as indices into the vector. The log-likelihood \log \pi_{\theta}(a|s) of an action a under the policy can then be obtained by indexing into the vector: \log \pi_{\theta}(a|s) = \log \left[P_{\theta}(s)\right]_a.
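The indexing step can be sketched in NumPy as below. The probability values and the helper names `log_likelihood` and `batch_log_likelihood` are hypothetical, chosen only to illustrate the idea.

```python
import numpy as np

def log_likelihood(probs, action):
    # probs plays the role of P_theta(s); the action is used as an index into it.
    return np.log(probs[action])

def batch_log_likelihood(probs, actions):
    # probs: (batch, n_actions); actions: (batch,) integer indices.
    # Pick one probability per row, then take the log.
    rows = np.arange(probs.shape[0])
    return np.log(probs[rows, actions])

# Assumed example values.
probs = np.array([0.7, 0.2, 0.1])
logp = log_likelihood(probs, 1)  # equals log(0.2)

batch_probs = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.1, 0.8]])
batch_logp = batch_log_likelihood(batch_probs, np.array([0, 2]))
```

In practice the log-probabilities are usually computed directly from the logits with a log-softmax rather than by taking the log of the softmax output, since that avoids underflow for very unlikely actions.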