Optimising Lockdown Policies for Epidemic Control using Reinforcement Learning. Harshad Khadilkar, Tanuja Ganu, Deva P Seetharam. https://arxiv.org/abs/2003.14093