[1] Jiang Tao, Wang Jianzhong, Shi Jiadong. Autonomous Return Path Planning for Small Mobile Robots[J]. Computer Engineering, 2015, 41(1): 164-168. (in Chinese)
[2] Liu Jie, Zhao Haifang, Zhou Delian. Path Planning of Mobile Robot Based on Improved Quantum Behavior Particle Swarm Optimization[J]. Computer Science, 2017, 44(S2): 123-128. (in Chinese)
[3] Zhao Xiao, Wang Zheng, Huang Chengkan, et al. Path Planning of Mobile Robot Based on Improved A* Algorithm[J]. Robot, 2018, 40(6): 903-910. (in Chinese)
[4] Guo Peng, Yu Jianbo. Run-to-Run Control of Manufacturing Process Based on Deep Reinforcement Learning[J/OL]. Acta Automatica Sinica, [2021-02-06]. https://doi.org/10.16383/j.aas.c190546. (in Chinese)
[5] van Hasselt H, Guez A, Silver D. Deep Reinforcement Learning with Double Q-learning[C]// Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16). Arizona, USA: AAAI, 2016: 2094-2100.
[6] Wang Dafang. Research on Robot Navigation Based on Deep Reinforcement Learning[D]. Xuzhou: China University of Mining and Technology, 2019. (in Chinese)
[7] Deng Wu. Research and Application of Agent Obstacle Avoidance and Path Planning Based on Deep Reinforcement Learning[D]. Chengdu: University of Electronic Science and Technology of China, 2019. (in Chinese)
[8] Jiang Qizhou, Zeng Bi. Research on Navigation Strategy of Mobile Robot Based on Deep Reinforcement Learning[J]. Computer Measurement and Control, 2019, 27(8): 217-221. (in Chinese)
[9] Zhang Xinyi, Zhang Zhipeng, Zhang Tieying, et al. RLO: A Reinforcement Learning-based Method for Join Optimization[J]. Scientia Sinica (Informationis), 2020, 50(5): 637-648. (in Chinese)
[10] Qiao Junfei, Hou Zhanjun, Ruan Xiaogang. Application of Reinforcement Learning Based on Neural Network in Obstacle Avoidance[J]. Journal of Tsinghua University (Science and Technology), 2008(S2): 1747-1750. (in Chinese)
[11] Wang Yiran, Jing Xiaochuan, Tian Tao, et al. Research on Multi-agent Path Planning Method Based on Reinforcement Learning[J]. Computer Applications and Software, 2019, 36(8): 165-171. (in Chinese)
[12] Gao Hui. Research on Mobile Robot Path Planning Based on Reinforcement Learning[D]. Chengdu: Southwest Jiaotong University, 2016. (in Chinese)
[13] Li Heyu, Zhao Zhilong, Gu Lei, et al. Robot Arm Control Method Based on Deep Reinforcement Learning[J]. Journal of System Simulation, 2019, 31(11): 2452-2457. (in Chinese)
[14] Zhou Jianpin, Zhang Shuliu. Dynamic Inventory Path Optimization Based on Deep Reinforcement Learning[J]. Journal of System Simulation, 2019, 31(10): 2155-2163. (in Chinese)
[15] Yan Fengting, Jia Jinyuan. Power Vehicle Fault Diagnosis Method Based on Deep Learning Sequential Inspection[J]. Journal of System Simulation, 2019, 31(1): 16-26. (in Chinese)
[16] Liu Jianwei, Gao Feng, Luo Xionglin. Survey of Deep Reinforcement Learning Based on Value Function and Policy Gradient[J]. Chinese Journal of Computers, 2019, 42(6): 1406-1438. (in Chinese)
[17] Xu Zhixiong, Cao Lei, Chen Xiliang, et al. Research on the Simulation of Unmanned Tank Battle Based on Reinforcement Learning[J]. Computer Engineering and Applications, 2018, 13(8): 266-272. (in Chinese)
[18] Liu Quan, Yan Yan, Zhu Fei, et al. A Deep Recurrent Q Network with Exploratory Noise[J]. Chinese Journal of Computers, 2019, 42(7): 1588-1604. (in Chinese)
[19] Sutton R S, Barto A G. Reinforcement Learning: An Introduction[M]. Cambridge: The MIT Press, 1998.
[20] He Liuliu, Yang Yang, Li Zheng, et al. Reward of Reinforcement Learning of Test Optimization for Continuous Integration[J]. Journal of Software, 2019, 30(5): 1438-1449. (in Chinese)
[21] Du Wei, Ding Shifei. Overview on Multi-agent Reinforcement Learning[J]. Computer Science, 2019, 46(8): 1-8. (in Chinese)
[22] Li Bo. Research on Multi-agent Path Planning and Formation Method Based on Hierarchical Reinforcement Learning[D]. Xinxiang: Henan Normal University, 2016. (in Chinese)