I will add the experimental agent IMED-RL from the paper "IMED-RL: Regret optimal learning of ergodic Markov decision processes" by Fabien Pesquerel and Odalric-Ambrym Maillard.
IMED-RL is a learning policy that is asymptotically optimal for regret minimization under the average-reward criterion in ergodic MDPs with unknown rewards and transitions.