Optimization of Product Oil Distribution with Multiple Trips and Multiple Due Dates under Dynamic Demand

doi:10.16182/j.issn1004731x.joss.0124-0243

Abstract

Under dynamic demand, this paper considers order due dates, vehicle transportation time windows, and other factors, and builds an optimization model for periodic product oil distribution with multiple trips and multiple due dates, with the objective of maximizing distribution revenue. A reinforcement learning-based large neighborhood search algorithm is designed to solve the model. The initial solution is constructed with a forward insertion heuristic. A deep reinforcement learning model then handles neighborhood operator selection: a double deep Q-network fits the action-value function, and the operator with the highest estimated value is selected to obtain the distribution plan. Experimental results show that the proposed algorithm effectively improves solving speed while maintaining solution quality.

Key words: product oil distribution, dynamic demand, multiple trips, multiple due dates, reinforcement learning
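
The operator-selection step described in the abstract relies on a double deep Q-network. As a minimal sketch of that idea, and not the authors' implementation, the function below computes the standard double Q-learning target: the online network chooses the next action and the target network evaluates it. The function name, the list-based Q-value inputs, and the example numbers are illustrative assumptions; the state features and reward definition are not given in this section.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    gamma: float = 0.95,
    done: bool = False,
) -> float:
    """Return the double Q-learning target for one transition.

    q_online_next and q_target_next hold the Q-values of every action in the
    next state, as estimated by the online and target networks (hypothetical
    inputs; a real agent would obtain them from two neural networks).
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # The online network selects the next action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # The online network prefers action 1; the target network values it at 2.0.
    print(double_dqn_target(1.0, [0.2, 0.9, 0.5], [1.0, 2.0, 3.0]))
```

Decoupling action selection from action evaluation is what distinguishes this target from the plain deep Q-network target, which would take the maximum of `q_target_next` directly.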

CLC number: TP18

Xie Yong, Gao Hailong, Chen Yutao, et al. Optimization of Product Oil Distribution with Multiple Trips and Multiple Due Dates under Dynamic Demand[J]. Journal of System Simulation, 2025, 37(8): 2016-2029.

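Table 3  Hyperparameter settings for neural network training

Hyperparameter	Value
Number of training rounds $M$	1 000
Experience pool capacity $N$	10 000
Replay period $K$	256
Sampling batch size $k$	256
Step size $η$	0.001
Discount factor $γ$	0.95
Episode length $T$	500
Target network update interval $C$	200
Prioritized experience replay exponent $α$	0.6
Prioritized experience replay exponent $β$	0.4
Initial greedy factor $ε_I$	0.5
Final greedy factor $ε_E$	0.1
Exploration decay factor $λ$	0.995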
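
To make the exploration settings concrete, the sketch below stores the table's values in a configuration object and computes an exploration rate per training round. The class and field names are hypothetical, and the multiplicative schedule $ε = \max(ε_E, ε_I λ^m)$ is an assumption: the table gives the three values but not the formula that combines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from the hyperparameter table; field names are illustrative."""
    training_rounds: int = 1000         # M
    replay_capacity: int = 10_000       # N
    replay_period: int = 256            # K
    batch_size: int = 256               # k
    step_size: float = 0.001            # eta
    discount: float = 0.95              # gamma
    episode_length: int = 500           # T
    target_update_interval: int = 200   # C
    per_alpha: float = 0.6              # prioritized replay exponent alpha
    per_beta: float = 0.4               # prioritized replay exponent beta
    epsilon_initial: float = 0.5        # epsilon_I
    epsilon_final: float = 0.1          # epsilon_E
    epsilon_decay: float = 0.995        # lambda


def epsilon_at(round_index: int, cfg: TrainingConfig = TrainingConfig()) -> float:
    """Exploration rate after `round_index` decay steps (assumed schedule)."""
    decayed = cfg.epsilon_initial * cfg.epsilon_decay ** round_index
    # Never explore less than the final greedy factor.
    return max(cfg.epsilon_final, decayed)


if __name__ == "__main__":
    for m in (0, 100, 321, 322, 1000):
        print(m, epsilon_at(m))
```

Under this assumed schedule the exploration rate reaches its floor of 0.1 after 322 decay steps, so roughly the last two thirds of the 1 000 training rounds would run at the final greedy factor.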



Action	Neighborhood operator combination
Action 1	RSR + GPH
Action 2	RSR + RRH
Action 3	WOR + GPH
Action 4	WOR + RRH
Action 5	LTR + GPH
Action 6	LTR + RRH
Action 7	DTR + GPH
Action 8	DTR + RRH
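
The eight actions are the Cartesian product of four first operators and two second operators, which the sketch below reproduces together with an ε-greedy choice over them. It assumes the first abbreviation names a removal (destroy) operator and the second an insertion (repair) operator, as is usual in large neighborhood search; the abbreviations are not expanded in this section, and the function and variable names are hypothetical.

```python
import itertools
import random
from typing import Dict, Sequence, Tuple

# Assumed roles: first operator destroys part of the solution, second repairs it.
REMOVAL_OPERATORS = ["RSR", "WOR", "LTR", "DTR"]
INSERTION_OPERATORS = ["GPH", "RRH"]

# Action numbers 1..8 in the same order as the action table.
ACTIONS: Dict[int, Tuple[str, str]] = {
    number: pair
    for number, pair in enumerate(
        itertools.product(REMOVAL_OPERATORS, INSERTION_OPERATORS), start=1
    )
}


def select_action(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Pick an action number with an epsilon-greedy rule.

    q_values[i] is the estimated value of action i + 1 (hypothetical input,
    e.g. the output of a trained Q-network for the current state).
    """
    if rng.random() < epsilon:
        # Explore: uniform choice among all operator combinations.
        return rng.randrange(1, len(ACTIONS) + 1)
    # Exploit: the combination with the highest estimated value.
    return max(ACTIONS, key=lambda action: q_values[action - 1])


if __name__ == "__main__":
    rng = random.Random(0)
    chosen = select_action([0.1, 0.4, 0.9, 0.2, 0.0, 0.3, 0.5, 0.6], epsilon=0.0, rng=rng)
    print(chosen, ACTIONS[chosen])
```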

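Station No.	Product No.	Demand/L	Order time	Due date	Replenishment time window
1	1	15 000	09:00	12:00	[09:00, 16:00]
1	2	12 000	06:00	10:00	[09:00, 16:00]
1	3	10 000	09:30	14:30	[09:00, 16:00]
2	1	12 000	08:00	11:30	[11:00, 15:00]
2	2	10 000	06:00	12:00	[11:00, 15:00]
2	3	13 000	07:30	13:30	[11:00, 15:00]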
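
One way to hold these order records in code is sketched below, with the time between order placement and due date derived for each order. The `Order` class, its field names, and the due-date ordering are illustrative assumptions and are not taken from the paper's model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' clock time to minutes after midnight."""
    parsed = datetime.strptime(hhmm, "%H:%M")
    return parsed.hour * 60 + parsed.minute


@dataclass(frozen=True)
class Order:
    station: int
    product: int
    demand_litres: int
    placed: str   # order time, 'HH:MM'
    due: str      # due date within the day, 'HH:MM'

    @property
    def lead_time_minutes(self) -> int:
        # Time available between order placement and the due date.
        return to_minutes(self.due) - to_minutes(self.placed)


# Rows copied from the order table.
ORDERS: List[Order] = [
    Order(1, 1, 15_000, "09:00", "12:00"),
    Order(1, 2, 12_000, "06:00", "10:00"),
    Order(1, 3, 10_000, "09:30", "14:30"),
    Order(2, 1, 12_000, "08:00", "11:30"),
    Order(2, 2, 10_000, "06:00", "12:00"),
    Order(2, 3, 13_000, "07:30", "13:30"),
]

# Orders sorted by due date (ties keep table order because sorted() is stable).
BY_DUE_DATE = sorted(ORDERS, key=lambda order: to_minutes(order.due))

if __name__ == "__main__":
    for order in BY_DUE_DATE:
        print(order.station, order.product, order.due, order.lead_time_minutes)
```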

Instance configuration	RLLNS total revenue/yuan	RLLNS time/s	IALNS total revenue/yuan	IALNS time/s	ILNS total revenue/yuan	ILNS time/s
S23-O69-V5-C3-8000	31 019	10.32	29 235	28.26	28 022	31.02
S23-O69-V5-C4-8000	26 873	9.56	27 162	30.23	26 058	38.51
S23-O69-V6-C3-16000	59 016	10.91	56 178	33.26	55 193	40.62
S23-O69-V6-C4-16000	59 718	10.89	57 635	28.47	56 334	28.47
S50-O150-V12-C3-8000	72 364	18.34	72 159	72.36	70 228	76.96
S50-O150-V12-C4-8000	81 921	17.17	80 234	78.21	78 676	77.29
S50-O150-V14-C3-16000	155 298	19.33	154 289	72.33	152 798	80.61
S50-O150-V14-C4-16000	159 984	18.27	157 564	71.92	155 875	76.52
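
The comparison can be summarized per instance as a relative revenue gain and a run-time ratio. The sketch below does this with the figures from the table; the helper names are hypothetical, and the two metrics are a reading aid and are not reported in the source.

```python
from typing import Dict, Tuple

# instance -> algorithm -> (total revenue in yuan, run time in seconds)
RESULTS: Dict[str, Dict[str, Tuple[int, float]]] = {
    "S23-O69-V5-C3-8000": {"RLLNS": (31019, 10.32), "IALNS": (29235, 28.26), "ILNS": (28022, 31.02)},
    "S23-O69-V5-C4-8000": {"RLLNS": (26873, 9.56), "IALNS": (27162, 30.23), "ILNS": (26058, 38.51)},
    "S23-O69-V6-C3-16000": {"RLLNS": (59016, 10.91), "IALNS": (56178, 33.26), "ILNS": (55193, 40.62)},
    "S23-O69-V6-C4-16000": {"RLLNS": (59718, 10.89), "IALNS": (57635, 28.47), "ILNS": (56334, 28.47)},
    "S50-O150-V12-C3-8000": {"RLLNS": (72364, 18.34), "IALNS": (72159, 72.36), "ILNS": (70228, 76.96)},
    "S50-O150-V12-C4-8000": {"RLLNS": (81921, 17.17), "IALNS": (80234, 78.21), "ILNS": (78676, 77.29)},
    "S50-O150-V14-C3-16000": {"RLLNS": (155298, 19.33), "IALNS": (154289, 72.33), "ILNS": (152798, 80.61)},
    "S50-O150-V14-C4-16000": {"RLLNS": (159984, 18.27), "IALNS": (157564, 71.92), "ILNS": (155875, 76.52)},
}


def revenue_gain_pct(instance: str, baseline: str) -> float:
    """Relative revenue gain of RLLNS over `baseline`, in percent."""
    rllns_revenue = RESULTS[instance]["RLLNS"][0]
    baseline_revenue = RESULTS[instance][baseline][0]
    return (rllns_revenue - baseline_revenue) / baseline_revenue * 100


def speedup(instance: str, baseline: str) -> float:
    """Run time of `baseline` divided by run time of RLLNS."""
    return RESULTS[instance][baseline][1] / RESULTS[instance]["RLLNS"][1]


if __name__ == "__main__":
    for name in RESULTS:
        print(name, round(revenue_gain_pct(name, "IALNS"), 2), round(speedup(name, "IALNS"), 2))
```

On these figures RLLNS has the shortest run time on all eight instances and the highest revenue on seven of them; on S23-O69-V5-C4-8000, IALNS reports a higher revenue (27 162 versus 26 873 yuan).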

Degree of dynamism/%	RLLNS mean/yuan	RLLNS gap/%	IALNS mean/yuan	IALNS gap/%
0	33 375	0	30 268	0
15	32 692	2.05	29 569	2.31
30	32 140	3.70	28 912	4.48
45	31 588	5.35	28 139	7.03
60	30 864	7.52	27 134	10.35
75	30 122	9.75	26 331	13.01
90	29 616	11.26	25 237	16.62
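
The gap columns are consistent with measuring each mean against the same algorithm's value at 0% dynamism, i.e. gap = (base − value) / base × 100. This formula is inferred from the numbers and is not stated in this section; the sketch below recomputes the gaps under that assumption, with hypothetical names.

```python
from typing import Dict

# algorithm -> degree of dynamism (%) -> mean value in yuan, copied from the table
MEAN_VALUE: Dict[str, Dict[int, int]] = {
    "RLLNS": {0: 33375, 15: 32692, 30: 32140, 45: 31588, 60: 30864, 75: 30122, 90: 29616},
    "IALNS": {0: 30268, 15: 29569, 30: 28912, 45: 28139, 60: 27134, 75: 26331, 90: 25237},
}

# Gap columns as reported in the table, used only for comparison.
REPORTED_GAP: Dict[str, Dict[int, float]] = {
    "RLLNS": {0: 0, 15: 2.05, 30: 3.70, 45: 5.35, 60: 7.52, 75: 9.75, 90: 11.26},
    "IALNS": {0: 0, 15: 2.31, 30: 4.48, 45: 7.03, 60: 10.35, 75: 13.01, 90: 16.62},
}


def gap_pct(algorithm: str, dynamism: int) -> float:
    """Relative drop versus the same algorithm at 0% dynamism, in percent."""
    base = MEAN_VALUE[algorithm][0]
    return (base - MEAN_VALUE[algorithm][dynamism]) / base * 100


if __name__ == "__main__":
    for algorithm, series in MEAN_VALUE.items():
        for dynamism in series:
            print(algorithm, dynamism, round(gap_pct(algorithm, dynamism), 2))
```

Recomputed this way, every entry matches the reported gap to two decimals, and the drop at 90% dynamism is 11.26% for RLLNS against 16.62% for IALNS.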

Plan update period/h	Mean distribution revenue/yuan	Gap/%	Mean stockout penalty cost	Gap/%
Static	32 357	0	9 326	0
0.5	29 682	8.27	9 671	3.70
1.0	30 112	6.94	10 025	7.50
1.5	30 422	5.98	10 392	11.43
2.0	31 372	3.04	10 693	14.66
2.5	31 254	3.41	11 118	19.22
3.0	30 770	4.90	11 455	22.83
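
A short sketch of how the table can be read as a trade-off: it picks the update period with the highest mean revenue among the dynamic settings and checks how the stockout penalty moves as the period grows. The variable names are hypothetical, and measuring the penalty gap as an increase over the static case is an assumption that matches the reported percentages.

```python
from typing import Dict, Tuple

# Static benchmark: (mean distribution revenue in yuan, mean stockout penalty cost)
STATIC: Tuple[int, int] = (32357, 9326)

# update period in hours -> (mean distribution revenue in yuan, mean stockout penalty cost)
BY_PERIOD: Dict[float, Tuple[int, int]] = {
    0.5: (29682, 9671),
    1.0: (30112, 10025),
    1.5: (30422, 10392),
    2.0: (31372, 10693),
    2.5: (31254, 11118),
    3.0: (30770, 11455),
}


def penalty_increase_pct(period: float) -> float:
    """Increase of the stockout penalty over the static case, in percent."""
    return (BY_PERIOD[period][1] - STATIC[1]) / STATIC[1] * 100


# Update period with the highest mean revenue among the dynamic settings.
BEST_PERIOD = max(BY_PERIOD, key=lambda period: BY_PERIOD[period][0])

# Stockout penalties listed in order of increasing update period.
PENALTIES = [BY_PERIOD[period][1] for period in sorted(BY_PERIOD)]

if __name__ == "__main__":
    print(BEST_PERIOD, round(penalty_increase_pct(BEST_PERIOD), 2))
```

With these figures the 2.0 h period gives the highest mean revenue of the dynamic settings, while the stockout penalty rises steadily as the update period lengthens.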