A diagonal Gaussian policy always has a neural network that maps from observations to mean actions.

First Way:

A Gaussian distribution is described by a mean vector, μ, and a covariance matrix, Σ.
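
To make the mean-plus-covariance description concrete, here is a minimal sketch of sampling from a diagonal Gaussian and evaluating its log-density. It assumes the diagonal covariance is stored as a vector of log standard deviations (a common choice, but an assumption here), and the names `sample_diag_gaussian`, `log_prob_diag_gaussian`, `mean`, and `log_std` are hypothetical, chosen only for illustration. In a policy, `mean` would be the output of the network for a given observation.

```python
import math
import random


def sample_diag_gaussian(mean, log_std, rng=random):
    """Draw one sample: mean + std * z, with z ~ N(0, 1) per dimension."""
    return [m + math.exp(ls) * rng.gauss(0.0, 1.0)
            for m, ls in zip(mean, log_std)]


def log_prob_diag_gaussian(x, mean, log_std):
    """Log-density of x under a Gaussian with diagonal covariance.

    Because the covariance is diagonal, the density factorizes over
    dimensions, so no matrix inverse or determinant is needed.
    """
    k = len(mean)
    # Sum of squared standardized residuals, ((x_i - mu_i) / sigma_i)^2.
    quad = sum(((xi - m) / math.exp(ls)) ** 2
               for xi, m, ls in zip(x, mean, log_std))
    return -0.5 * (quad + 2.0 * sum(log_std) + k * math.log(2.0 * math.pi))


# Illustrative values only (assumed, not from the text above).
mean = [0.5, -1.0]
log_std = [0.0, math.log(2.0)]  # standard deviations 1.0 and 2.0
action = sample_diag_gaussian(mean, log_std, random.Random(0))
print(action, log_prob_diag_gaussian(action, mean, log_std))
```

Storing log standard deviations rather than standard deviations keeps the parameters unconstrained, since exponentiating always gives a positive value.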