Journal of System Simulation ›› 2025, Vol. 37 ›› Issue (11): 2754-2767.doi: 10.16182/j.issn1004731x.joss.24-0678
Xing Lijing1, Li Min1, Zeng Xiangguang1, Zhang Ping2, Peng Bei2
Received:2024-06-26
Revised:2024-09-11
Online:2025-11-18
Published:2025-11-27
Contact: Li Min
Xing Lijing, Li Min, Zeng Xiangguang, Zhang Ping, Peng Bei. AUV Path Planning Based on Behavior Cloning and Improved DQN in Partially Unknown Environments[J]. Journal of System Simulation, 2025, 37(11): 2754-2767.
Table 2
Comparison of different models' performance after 2 000 training epochs
| Algorithm | Expanded experience pool | Random sampling | Conventional sampling | New sampling | | | | |
|---|---|---|---|---|---|---|---|---|
| EPRO_A_DQN | √ | × | × | √ | 8/10 | 9.486/75.86 | 0.193/1.540 | 3.62/28.97 |
| EPER_A_DQN | √ | × | √ | × | 5/10 | 9.730/48.64 | 0.224/1.120 | 4.40/22.00 |
| ER_A_DQN | √ | √ | × | × | 5/10 | 11.46/57.28 | 0.227/1.130 | 4.59/22.94 |
| PRO_A_DQN | × | × | × | √ | 4/10 | 12.24/48.97 | 0.289/1.157 | 5.42/21.66 |
| PER_A_DQN | × | × | √ | × | 4/10 | 15.63/62.53 | 0.292/1.168 | 4.84/19.34 |
| A_DQN | × | √ | × | × | 3/10 | 24.59/73.79 | 0.648/1.944 | 8.97/26.91 |
| DQN | × | × | × | × | 0/10 | -/46.43 | -/0.940 | -/18.53 |
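Table 2 distinguishes the DQN variants by how transitions are drawn from the replay buffer: uniform random sampling versus priority-based sampling. As a hedged illustration only (not the paper's implementation; all names and parameters here are hypothetical), the two strategies can be sketched as:

```python
import random
import numpy as np

class ReplayBuffer:
    """Minimal replay buffer sketching the two sampling strategies
    compared in Table 2: uniform random sampling vs. PER-style
    priority-proportional sampling. Illustrative only."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self.transitions = []  # (state, action, reward, next_state, done)
        self.priorities = []   # one priority per stored transition

    def push(self, transition, priority=1.0):
        # Drop the oldest transition once the buffer is full.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append(transition)
        self.priorities.append(priority)

    def sample_random(self, batch_size):
        # Uniform sampling: every stored transition is equally likely.
        return random.sample(self.transitions, batch_size)

    def sample_prioritized(self, batch_size, alpha=0.6):
        # Priority-proportional sampling: transitions with larger
        # priority (e.g. TD error) are replayed more often.
        p = np.asarray(self.priorities, dtype=float) ** alpha
        p /= p.sum()
        idx = np.random.choice(len(self.transitions), size=batch_size,
                               replace=False, p=p)
        return [self.transitions[i] for i in idx]
```

The "expanded experience pool" column in the table refers to enlarging this buffer with additional (e.g. synthetic or expert) transitions before training; the sketch above covers only the sampling side of the comparison.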
Table 5
Comparison of time efficiency across five experimental groups in four scenarios
| Scenario | Algorithm | Trial 1 | Trial 2 | Trial 3 | Trial 4 | Trial 5 | Average time | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ⅰ | BA_DQN | 0.33 | 21.6 | 0.30 | 25.5 | 0.32 | 24.6 | 0.31 | 17.1 | 0.31 | 29.2 | 0.314 | 23.60 | 11.8 |
| A* | 0.40 | 25.9 | 0.32 | 29.3 | 0.35 | 25.0 | 0.34 | 18.4 | 0.37 | 34.3 | 0.356 | 26.58 | ||
| Ⅱ | BA_DQN | 0.39 | 24.8 | 0.38 | 30.7 | 0.35 | 29.3 | 0.37 | 29.3 | 0.37 | 27.7 | 0.372 | 28.36 | 17.7 |
| A* | 0.42 | 30.4 | 0.49 | 33.8 | 0.39 | 32.7 | 0.46 | 34.5 | 0.50 | 34.1 | 0.452 | 33.10 | ||
| Ⅲ | BA_DQN | 0.59 | 44.3 | 0.56 | 39.5 | 0.59 | 40.3 | 0.54 | 44.6 | 0.49 | 38.1 | 0.554 | 41.36 | 19.7 |
| A* | 0.86 | 47.0 | 0.60 | 42.1 | 0.73 | 42.8 | 0.61 | 46.1 | 0.65 | 44.4 | 0.690 | 44.48 | ||
| Ⅳ | BA_DQN | 1.39 | 103.9 | 1.33 | 109.0 | 1.38 | 95.1 | 1.50 | 98.1 | 1.43 | 97.0 | 1.406 | 100.62 | 36.0 |
| A* | 1.85 | 104.7 | 1.86 | 129.3 | 2.53 | 97.5 | 2.46 | 114.3 | 2.29 | 116.8 | 2.198 | 112.52 | ||
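Table 5 benchmarks the proposed BA_DQN planner against A* on grid scenarios of increasing size. For context, a minimal 4-connected grid A* with a Manhattan heuristic (a generic sketch, not the paper's baseline code; the grid encoding and function name are assumptions) looks like:

```python
import heapq

def astar(grid, start, goal):
    """Shortest path on a 4-connected grid via A* with a Manhattan
    heuristic. grid: 2-D list where 0 = free cell, 1 = obstacle.
    Returns the path as a list of (row, col) cells, or None."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_heap = [(h(start), 0, start)]  # (f = g + h, g, cell)
    came_from, g_cost = {}, {start: 0}
    while open_heap:
        _, g, cur = heapq.heappop(open_heap)
        if cur == goal:
            # Reconstruct the path by walking parents back to start.
            path = [cur]
            while cur in came_from:
                cur = came_from[cur]
                path.append(cur)
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0):
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    came_from[nxt] = cur
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None  # no path exists
```

A*'s per-query search cost grows with map size, which is consistent with the widening planning-time gap from Scenario I to IV in the table; a trained policy instead amortizes that cost into offline training.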