Journal of System Simulation, 2024, Vol. 36, Issue (7): 1670-1681. doi: 10.16182/j.issn1004731x.joss.23-0443
Gong Xue1, Peng Pengfei1, Rong Li1, Zheng Yalian2, Jiang Jun1
Received: 2023-04-14
Revised: 2023-06-01
Online: 2024-07-15
Published: 2024-07-12
Contact: Rong Li
E-mail: gogxue@163.com; 33574319@qq.com
About the first author: Gong Xue (1998-), female, master's student; research interests: artificial intelligence and big data. Email: gogxue@163.com
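Abstract:
To address the high coupling of cooperative task interactions and the large number of influencing factors in task analysis, a task analysis method based on sequence decoupling and deep reinforcement learning is proposed, which achieves task decomposition and task sequence reconstruction under complex constraints. A deep reinforcement learning environment based on task information interaction is designed, and the difference between the loss functions of the target network and the evaluation network is used to improve …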
Cite this article: Gong Xue, Peng Pengfei, Rong Li, et al. Task Analysis Methods Based on Deep Reinforcement Learning[J]. Journal of System Simulation, 2024, 36(7): 1670-1681.