A diagonal Gaussian policy always has neural networks that maps from observations to mean actions
First Way:
Gaussian distribution is described by a mean vector , and a covariance matrix
A diagonal Gaussian policy always has neural networks that maps from observations to mean actions
First Way:
Gaussian distribution is described by a mean vector , and a covariance matrix