Journal of System Simulation (系统仿真学报) ›› 2026, Vol. 38 ›› Issue (6): 1684-1698. doi: 10.16182/j.issn1004731x.joss.25-0682

• Papers •

Research on Control Strategy for Shortest Time Occupancy of AUV Based on Improved TD3

Ren Wenzhe1, Li Min1, Zeng Xiangguang1, Zhang Tao1, Xie Dijie1, Peng Bei2

  1. School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
  2. University of Electronic Science and Technology of China, Chengdu 611731, China
  • Received: 2025-07-16 Revised: 2025-09-15 Online: 2026-06-25 Published: 2026-06-25
  • Corresponding author: Li Min
  • About the first author: Ren Wenzhe (born 2001), male, master's student; research interests include unmanned underwater vehicles and deep reinforcement learning.
  • Funding:
    National Natural Science Foundation of China (52075456); Key Research and Development Program of the Sichuan Provincial Department of Science and Technology (2023YFG0285); Key Research and Development Program of the Sichuan Provincial Department of Science and Technology (2019ZDZX0020)

Abstract:

Existing occupancy models do not fully account for time-varying ocean-current disturbances or task time constraints, and AUVs lack real-time motion control. To address these problems, a shortest-time occupancy method based on a quantile-regression distributional TD3 algorithm is proposed. Bayesian inference is used to identify the hydrodynamic parameters, and the kinematic and dynamic models of the AUV are established. A shortest-time occupancy equation is then constructed and solved for the occupancy target point and the occupancy time. A first-order Gauss-Markov process is introduced to simulate the time-varying ocean-current environment, and the AUV occupancy control policy is trained with the distributional TD3 algorithm in current scenarios of different intensities. Simulation results show that the method is robust and adaptable under dynamic ocean-current disturbances. In particular, when the current intensity is high, policy convergence is 30% faster than with the baseline TD3 algorithm, and AUV occupancy accuracy and success rate are 63% and 20% higher, respectively.

Key words: autonomous underwater vehicle, underwater occupancy, TD3 algorithm, Bayesian inference, shortest occupancy time
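
The first-order Gauss-Markov current model mentioned in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name `simulate_gauss_markov_current`, the scalar current speed, and every parameter value (time step `dt`, correlation time `tau`, stationary standard deviation `sigma`, mean speed `mean`) are assumptions chosen for illustration. The sketch uses the standard exact discretisation of a mean-reverting process driven by Gaussian white noise.

```python
import math
import random


def simulate_gauss_markov_current(n_steps, dt=0.1, tau=50.0, sigma=0.3,
                                  mean=0.5, v0=None, seed=0):
    """Sample a current-speed trace from a first-order Gauss-Markov process.

    Exact discretisation of dv/dt = -(v - mean)/tau + white noise, scaled so
    that the stationary standard deviation equals `sigma`.
    All default values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    phi = math.exp(-dt / tau)                        # one-step correlation
    noise_std = sigma * math.sqrt(1.0 - phi * phi)   # keeps the variance stationary
    v = mean if v0 is None else v0
    trace = [v]
    for _ in range(n_steps):
        # Pull the speed back toward the mean, then add Gaussian noise.
        v = mean + phi * (v - mean) + noise_std * rng.gauss(0.0, 1.0)
        trace.append(v)
    return trace


if __name__ == "__main__":
    speeds = simulate_gauss_markov_current(n_steps=1000)
    print(f"min={min(speeds):.3f} m/s, max={max(speeds):.3f} m/s")
```

In a training loop, a trace like this could be sampled once per episode and added to the vehicle's velocity as a disturbance. A larger `sigma` would correspond to a stronger current scenario, though how the paper parameterises current intensity is not stated in the abstract.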

CLC Number: