Understanding the differences between supervised, unsupervised, semi-supervised, and reinforcement learning through examples: https://www.jianshu.com/p/56fe011d9bae
Study notes on Deep Q-Learning (DQN): https://www.jianshu.com/p/72cab5460ebe
(Gao Yang) A survey of reinforcement learning research: https://wenku.baidu.com/view/11cb573f770bf78a642954b6.html
DQN: https://blog.csdn.net/bbbeoy/article/details/79072083
DQN: https://cloud.tencent.com/developer/article/1048358
Understanding reinforcement learning through Q-learning: http://baijiahao.baidu.com/s?id=1597978859962737001&wfr=spider&for=pc
Dueling DQN: https://www.cnblogs.com/pinard/p/9923859.html
Research on actor-critic methods for continuous action spaces (master's thesis): http://www.doc88.com/p-5436417127419.html
CS229 notes on reinforcement learning: https://blog.csdn.net/AMDS123/article/details/68956814