Optimization of Product Oil Distribution with Multiple Trips and Multiple Due Dates under Dynamic Demand

doi:10.16182/j.issn1004731x.joss.0124-0243

Abstract

This paper addresses periodic product oil distribution under dynamic demand. Taking into account order due dates, vehicle transportation time windows, and other factors, it develops an optimization model with multiple trips and multiple due dates that maximizes distribution revenue. A reinforcement learning-based large neighborhood search algorithm is designed to solve the problem. The initial solution is constructed with a forward insertion heuristic. A deep reinforcement learning model then handles neighborhood operator selection: a double deep Q-network fits the action-value function, the best neighborhood operator is selected accordingly, and the distribution scheme is obtained. Experimental results show that the proposed algorithm effectively shortens solution time while maintaining solution quality.

Key words: product oil distribution, dynamic demand, multiple trips, multiple due dates, reinforcement learning (RL)

CLC Number: TP18

Xie Yong, Gao Hailong, Chen Yutao, Wang Huanjiang. Optimization of Product Oil Distribution with Multiple Trips and Multiple Due Dates under Dynamic Demand[J]. Journal of System Simulation, 2025, 37(8): 2016-2029.

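Table 3  Settings for hyperparameters of neural network training

Hyperparameter	Value
Number of training episodes $M$	1 000
Replay buffer capacity $N$	10 000
Replay period $K$	256
Sampling batch size $k$	256
Step size $η$	0.001
Discount factor $γ$	0.95
Episode length $T$	500
Target network update interval $C$	200
Prioritized experience replay exponent $α$	0.6
Prioritized experience replay exponent $β$	0.4
Initial greedy factor $ε_I$	0.5
Final greedy factor $ε_E$	0.1
Exploration decay factor $λ$	0.995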
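
To make the role of the discount factor and the three exploration parameters in the table above concrete, the following is a minimal sketch rather than the authors' implementation. The double DQN target follows the standard formulation (the online network picks the next action, the target network evaluates it). The exploration schedule is an assumption: the table gives only the initial value, the final value, and the decay factor, so a multiplicative decay per step with a floor is used here for illustration. All function names are hypothetical.

```python
GAMMA = 0.95      # discount factor γ from the table
EPS_INIT = 0.5    # initial greedy factor ε_I
EPS_FINAL = 0.1   # final greedy factor ε_E
EPS_DECAY = 0.995 # exploration decay factor λ


def epsilon_at(step: int) -> float:
    """Exploration rate after `step` decays.

    Assumption: ε is multiplied by λ once per step and never drops below ε_E.
    The source does not state how often the decay is applied.
    """
    return max(EPS_FINAL, EPS_INIT * EPS_DECAY ** step)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=GAMMA):
    """Standard double DQN target for one transition.

    next_q_online / next_q_target are lists of action values for the next
    state, produced by the online and target networks respectively.
    """
    if done:
        return reward
    # The online network selects the action ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... and the target network supplies its value.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    print(epsilon_at(0), epsilon_at(100), epsilon_at(1000))
    print(double_dqn_target(1.0, [0.2, 0.9], [0.5, 0.3], done=False))
```

Decoupling action selection from action evaluation is what distinguishes the double DQN target from the plain DQN target, which would take the maximum of `next_q_target` directly.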

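Action No.	Neighborhood operator combination	Action No.	Neighborhood operator combination
Action 1	RSR + GPH	Action 5	LTR + GPH
Action 2	RSR + RRH	Action 6	LTR + RRH
Action 3	WOR + GPH	Action 7	DTR + GPH
Action 4	WOR + RRH	Action 8	DTR + RRH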
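
The eight actions in the table above are the Cartesian product of four first-position operators and two second-position operators. The sketch below rebuilds that action space and adds an ε-greedy selection rule. It rests on two assumptions: the table does not expand the acronyms, so reading the first operator as a removal step and the second as a repair step follows the usual structure of large neighborhood search, and the ε-greedy rule is the common choice for a Q-network policy, not something confirmed here. The names `ACTIONS` and `select_action` are hypothetical.

```python
import itertools
import random

# Assumption: first acronym = removal (destroy) operator, second = repair operator.
REMOVAL_OPS = ["RSR", "WOR", "LTR", "DTR"]
REPAIR_OPS = ["GPH", "RRH"]

# Action numbers 1..8 in the same order as the table.
ACTIONS = dict(enumerate(itertools.product(REMOVAL_OPS, REPAIR_OPS), start=1))


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice of an action number.

    q_values maps each action number (1..8) to its estimated action value.
    """
    if rng.random() < epsilon:
        return rng.choice(list(ACTIONS))            # explore
    return max(ACTIONS, key=lambda a: q_values[a])  # exploit


if __name__ == "__main__":
    q = {a: 0.0 for a in ACTIONS}
    q[3] = 1.0
    chosen = select_action(q, epsilon=0.0)
    print(chosen, ACTIONS[chosen])
```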

Gas station No.	Oil product No.	Demand/L	Order time	Due date	Replenishment time window
1	1	15 000	09:00	12:00	[09:00, 16:00]
1	2	12 000	06:00	10:00
1	3	10 000	09:30	14:30
2	1	12 000	08:00	11:30	[11:00, 15:00]
2	2	10 000	06:00	12:00
2	3	13 000	07:30	13:30

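Instance configuration	RLLNS algorithm		IALNS algorithm		ILNS algorithm
Instance configuration	Total revenue/CNY	Time/s	Total revenue/CNY	Time/s	Total revenue/CNY	Time/s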
S23-O69-V5-C3-8000	31 019	10.32	29 235	28.26	28 022	31.02
S23-O69-V5-C4-8000	26 873	9.56	27 162	30.23	26 058	38.51
S23-O69-V6-C3-16000	59 016	10.91	56 178	33.26	55 193	40.62
S23-O69-V6-C4-16000	59 718	10.89	57 635	28.47	56 334	28.47
S50-O150-V12-C3-8000	72 364	18.34	72 159	72.36	70 228	76.96
S50-O150-V12-C4-8000	81 921	17.17	80 234	78.21	78 676	77.29
S50-O150-V14-C3-16000	155 298	19.33	154 289	72.33	152 798	80.61
S50-O150-V14-C4-16000	159 984	18.27	157 564	71.92	155 875	76.52
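
The speed and quality comparison in the table above can be summarized per instance. The following sketch copies the tabulated values and computes, for RLLNS against a chosen baseline, the run-time ratio and the relative revenue difference. The two metrics and the function name `compare` are illustrative choices, not definitions taken from the paper.

```python
# (total revenue in CNY, run time in s), copied from the comparison table.
RESULTS = {
    "S23-O69-V5-C3-8000":    {"RLLNS": (31019, 10.32), "IALNS": (29235, 28.26), "ILNS": (28022, 31.02)},
    "S23-O69-V5-C4-8000":    {"RLLNS": (26873, 9.56),  "IALNS": (27162, 30.23), "ILNS": (26058, 38.51)},
    "S23-O69-V6-C3-16000":   {"RLLNS": (59016, 10.91), "IALNS": (56178, 33.26), "ILNS": (55193, 40.62)},
    "S23-O69-V6-C4-16000":   {"RLLNS": (59718, 10.89), "IALNS": (57635, 28.47), "ILNS": (56334, 28.47)},
    "S50-O150-V12-C3-8000":  {"RLLNS": (72364, 18.34), "IALNS": (72159, 72.36), "ILNS": (70228, 76.96)},
    "S50-O150-V12-C4-8000":  {"RLLNS": (81921, 17.17), "IALNS": (80234, 78.21), "ILNS": (78676, 77.29)},
    "S50-O150-V14-C3-16000": {"RLLNS": (155298, 19.33), "IALNS": (154289, 72.33), "ILNS": (152798, 80.61)},
    "S50-O150-V14-C4-16000": {"RLLNS": (159984, 18.27), "IALNS": (157564, 71.92), "ILNS": (155875, 76.52)},
}


def compare(baseline: str):
    """Return {instance: (speedup, revenue_gain_percent)} of RLLNS over `baseline`."""
    summary = {}
    for name, algos in RESULTS.items():
        rev_rl, time_rl = algos["RLLNS"]
        rev_base, time_base = algos[baseline]
        speedup = time_base / time_rl
        gain = (rev_rl - rev_base) / rev_base * 100
        summary[name] = (speedup, gain)
    return summary


if __name__ == "__main__":
    for name, (speedup, gain) in compare("IALNS").items():
        print(f"{name}: {speedup:.2f}x faster, revenue {gain:+.2f}%")
```

On these figures RLLNS is faster than both baselines on every instance and earns more than IALNS on seven of the eight; the exception is S23-O69-V5-C4-8000, where IALNS reports a slightly higher revenue.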

Degree of dynamism/%	RLLNS algorithm		IALNS algorithm
Degree of dynamism/%	Mean/CNY	Gap/%	Mean/CNY	Gap/%
0	33 375	0	30 268	0
15	32 692	2.05	29 569	2.31
30	32 140	3.70	28 912	4.48
45	31 588	5.35	28 139	7.03
60	30 864	7.52	27 134	10.35
75	30 122	9.75	26 331	13.01
90	29 616	11.26	25 237	16.62
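
The gap columns in the table above are not defined in this excerpt. Reading them as the relative revenue loss with respect to the 0 % dynamism row of the same algorithm reproduces every tabulated value, so the sketch below uses that reading as an assumption and checks it against the reported figures. The names `gaps` and `REPORTED_GAP` are hypothetical.

```python
DYNAMISM = [0, 15, 30, 45, 60, 75, 90]  # degree of dynamism in %

# Mean revenue in CNY, copied from the table.
MEAN_REVENUE = {
    "RLLNS": [33375, 32692, 32140, 31588, 30864, 30122, 29616],
    "IALNS": [30268, 29569, 28912, 28139, 27134, 26331, 25237],
}

# Gap in %, copied from the table.
REPORTED_GAP = {
    "RLLNS": [0, 2.05, 3.70, 5.35, 7.52, 9.75, 11.26],
    "IALNS": [0, 2.31, 4.48, 7.03, 10.35, 13.01, 16.62],
}


def gaps(values):
    """Relative loss (in %) of each value against the first (static) value."""
    base = values[0]
    return [round((base - v) / base * 100, 2) for v in values]


if __name__ == "__main__":
    for algo, revenue in MEAN_REVENUE.items():
        for d, computed, reported in zip(DYNAMISM, gaps(revenue), REPORTED_GAP[algo]):
            print(f"{algo} at {d}%: computed {computed}, reported {reported}")
```

Under this reading, the RLLNS revenue degrades more slowly than the IALNS revenue as the degree of dynamism rises, which is what the two gap columns show.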

Plan update period/h	Mean distribution revenue/CNY	Gap/%	Mean stock-out penalty cost	Gap/%
Static	32 357	0	9 326	0
0.5	29 682	8.27	9 671	3.70
1.0	30 112	6.94	10 025	7.50
1.5	30 422	5.98	10 392	11.43
2.0	31 372	3.04	10 693	14.66
2.5	31 254	3.41	11 118	19.22
3.0	30 770	4.90	11 455	22.83