Abstract
Imitation learning, also called apprenticeship learning, is one promising approach to alleviating the low sample efficiency of deep reinforcement learning: it leverages expert demonstrations in sequential decision-making. To give readers a comprehensive understanding of how to extract information effectively from demonstration data, we introduce the most important categories of imitation learning, including behavioral cloning, inverse reinforcement learning, imitation learning from observations, probabilistic methods, and other methods. Within reinforcement learning, imitation learning can serve either as an initialization or as guidance for training the agent. Combining imitation learning with reinforcement learning is a promising direction for efficient learning and faster policy optimization in practice.
Zihan Ding. Imitation Learning. Deep Reinforcement Learning[M]. DE: Springer, 2020: 273-306.