Journal of System Simulation ›› 2026, Vol. 38 ›› Issue (6): 1684-1698. doi: 10.16182/j.issn1004731x.joss.25-0682
• Papers •
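Ren Wenzhe1, Li Min1, Zeng Xiangguang1, Zhang Tao1, Xie Dijie1, Peng Bei2
Received: 2025-07-16
Revised: 2025-09-15
Published: 2026-06-25
Online: 2026-06-25
Corresponding author: Li Min
About the first author: Ren Wenzhe (2001-), male, master's student; research interests: unmanned underwater vehicles and deep reinforcement learning.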
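Abstract:
Existing occupancy models do not adequately account for time-varying ocean-current disturbances or mission time constraints, and autonomous underwater vehicles (AUVs) lack real-time motion control. To address these problems, a shortest-time occupancy method based on distributional TD3 with quantile regression is proposed. Hydrodynamic parameters are identified by Bayesian inference, and the kinematic and dynamic models of the AUV are established. A shortest-time occupancy equation is constructed and solved for the target occupancy point and the occupancy time. A first-order Gauss-Markov process is introduced to simulate the time-varying current environment, and AUV occupancy control policies are trained with the distributional TD3 algorithm in current scenarios of different strengths. Simulation results show that the method is robust and adaptive under dynamic current disturbances. In particular, under strong currents and relative to the TD3 baseline, policy convergence is 30% faster, and AUV occupancy accuracy and success rate improve by 63% and 20%, respectively.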
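The first-order Gauss-Markov current model mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common continuous form dV/dt + mu * V = w, where w is zero-mean Gaussian noise, and a simple forward-Euler update with one noise sample per step. The function name `simulate_current` and every numeric default (`dt`, `mu`, `sigma`, `v0`, and the speed bounds) are hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np


def simulate_current(n_steps, dt=0.1, mu=0.1, sigma=0.05, v0=0.3,
                     v_min=0.0, v_max=1.0, seed=0):
    """Simulate a current-speed sequence with a first-order Gauss-Markov process.

    Continuous model (assumed): dV/dt + mu * V = w, w ~ N(0, sigma^2).
    Discretization (assumed): forward Euler, one noise sample per step.
    All defaults are illustrative, not parameters from the paper.
    """
    rng = np.random.default_rng(seed)
    v = np.empty(n_steps)
    v[0] = v0
    for k in range(n_steps - 1):
        w = rng.normal(0.0, sigma)               # driving noise sample
        v_next = v[k] + dt * (-mu * v[k] + w)    # Euler step of dV/dt = -mu*V + w
        v[k + 1] = np.clip(v_next, v_min, v_max)  # keep the speed within bounds
    return v


if __name__ == "__main__":
    speeds = simulate_current(1000)
    print(speeds.min(), speeds.max())
```

With `sigma=0.0` the sequence decays deterministically toward zero at rate `mu`, which is a quick sanity check on the update rule. A larger `sigma` or `v_max` would correspond to a stronger and more variable current scenario.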
Cite this article: Ren Wenzhe, Li Min, Zeng Xiangguang, et al. Research on Control Strategy for Shortest Time Occupancy of AUV Based on Improved TD3[J]. Journal of System Simulation, 2026, 38(6): 1684-1698.
Table 8
Comparison of algorithms on the weak-current test set
| Algorithm | Occupancy coordinates/m | Occupancy time/s | Actual coordinates/m | Actual time/s | Mean coordinate error/m | Mean time error/s |
|---|---|---|---|---|---|---|
| DTD3 | (-41.71, 55.11) | 34.6 | (-39.77, 53.07) | 35.3 | 2.86 | 1.2 |
| | (-63.55, 121.38) | 68.5 | (-61.98, 119.03) | 70.5 | | |
| | (-32.49, 74.58) | 40.7 | (-30.47, 72.47) | 41.9 | | |
| TD3 | (-69.91, 101.51) | 61.6 | (-68.41, 98.92) | 62.6 | 3.00 | 1.7 |
| | (-36.52, 68.18) | 38.6 | (-35.42, 65.48) | 40.3 | | |
| | (-50.92, 102.99) | 57.4 | (-52.30, 100.21) | 59.6 | | |
Table 9
Comparison of algorithms on the strong-current test set
| Algorithm | Occupancy coordinates/m | Occupancy time/s | Actual coordinates/m | Actual time/s | Mean coordinate error/m | Mean time error/s |
|---|---|---|---|---|---|---|
| DTD3 | (-21.02, 28.35) | 17.6 | (-16.23, 27.27) | 20.8 | 5.62 | 3.6 |
| | (-49.37, 64.14) | 40.5 | (-44.96, 62.07) | 44.3 | | |
| | (-36.33, 58.32) | 34.4 | (-32.53, 52.36) | 38.5 | | |
| TD3 | (-23.00, 34.61) | 20.7 | (-13.19, 20.21) | 24.9 | 11.08 | 4.1 |
| | (-57.24, 80.49) | 49.3 | (-55.25, 73.94) | 53.5 | | |
| | (-36.36, 50.96) | 31.2 | (-29.20, 45.51) | 35.4 | | |
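
The page does not define the mean-error columns, so the sketch below rests on an assumption: the coordinate error is the Euclidean distance between the planned and actual occupancy points, the time error is the absolute difference between planned and actual times, and both are averaged over the listed trials. The names `TABLE9` and `mean_errors` are hypothetical, and the data are the Table 9 rows copied as printed.

```python
import math

# Table 9 rows: (planned (x, y) in m, planned time in s, actual (x, y) in m, actual time in s)
TABLE9 = {
    "DTD3": [
        ((-21.02, 28.35), 17.6, (-16.23, 27.27), 20.8),
        ((-49.37, 64.14), 40.5, (-44.96, 62.07), 44.3),
        ((-36.33, 58.32), 34.4, (-32.53, 52.36), 38.5),
    ],
    "TD3": [
        ((-23.00, 34.61), 20.7, (-13.19, 20.21), 24.9),
        ((-57.24, 80.49), 49.3, (-55.25, 73.94), 53.5),
        ((-36.36, 50.96), 31.2, (-29.20, 45.51), 35.4),
    ],
}


def mean_errors(rows):
    """Return (mean Euclidean position error, mean absolute time error).

    Assumed metric definitions; the source page does not state them.
    """
    pos = [math.dist(planned, actual) for planned, _, actual, _ in rows]
    tim = [abs(t_actual - t_planned) for _, t_planned, _, t_actual in rows]
    return sum(pos) / len(pos), sum(tim) / len(tim)


if __name__ == "__main__":
    for name, rows in TABLE9.items():
        pos_err, time_err = mean_errors(rows)
        print(f"{name}: position error {pos_err:.2f} m, time error {time_err:.1f} s")
```

Under this assumption the listed rows give position errors of about 5.62 m (DTD3) and 11.09 m (TD3), close to the tabulated 5.62 and 11.08. The time errors come out at 3.7 s and 4.2 s against the tabulated 3.6 and 4.1, a gap that may stem from rounding in the printed values.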