Journal of System Simulation, 2024, Vol. 36, Issue (7): 1670-1681. DOI: 10.16182/j.issn1004731x.joss.23-0443
• Research Article •
First author: Gong Xue (1998-), female, master's student; research interests: artificial intelligence and big data. E-mail: gogxue@163.com
Gong Xue 1, Peng Pengfei 1, Rong Li 1, Zheng Yalian 2, Jiang Jun 1
Received: 2023-04-14
Revised: 2023-06-01
Online: 2024-07-15
Published: 2024-07-12
Contact: Rong Li
E-mail: gogxue@163.com; 33574319@qq.com
Abstract: To address problems in task analysis such as the high coupling of collaborative task interactions and the large number of influencing factors, a task analysis method based on sequence decoupling and deep reinforcement learning is proposed, which achieves task decomposition and task-sequence reconstruction under complex constraints. A deep reinforcement learning environment based on task information interaction is designed, and the difference between the loss functions of the target network and the evaluation network is used to improve …
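The abstract describes a DQN-style setup with an evaluation network and a target network, where the difference between their loss values drives an improvement step (the quantity being improved is not stated in the excerpt above). Below is a minimal sketch, under the assumption of a standard DQN formulation in PyTorch; QNetwork, td_loss_gap, and the batch layout are hypothetical names introduced here for illustration, not the authors' implementation.

# Hypothetical sketch: compute the TD loss of the evaluation network and the
# gap between the target network's and the evaluation network's loss values.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    # Small MLP mapping a task-state vector to Q-values over candidate actions.
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.net(s)

def td_loss_gap(eval_net, target_net, batch, gamma=0.99):
    # batch = (states, actions, rewards, next_states, done_flags)
    s, a, r, s_next, done = batch
    q_eval = eval_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Standard DQN target built from the target network
        td_target = r + gamma * (1.0 - done) * target_net(s_next).max(dim=1).values
        q_tgt = target_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        target_loss = nn.functional.mse_loss(q_tgt, td_target)
    eval_loss = nn.functional.mse_loss(q_eval, td_target)
    # Loss-difference signal: one plausible reading of the abstract's phrasing
    return eval_loss, (target_loss - eval_loss.detach()).abs()

In a training loop, eval_loss would be backpropagated as usual, while the gap term could serve, for example, as a sampling-priority or shaping signal; which role it plays in the paper cannot be determined from the excerpt.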
Gong Xue, Peng Pengfei, Rong Li, et al. Task Analysis Methods Based on Deep Reinforcement Learning[J]. Journal of System Simulation, 2024, 36(7): 1670-1681.