Markov Decision Process (MDP) is a mathematical model for sequential decision making when outcomes are uncertain
Goal of MDP is to find a good “policy” for the decision maker, a function that specifies the action that the decision maker will choose when in state