系统仿真学报 ›› 2022, Vol. 34 ›› Issue (7): 1639-1650.doi: 10.16182/j.issn1004731x.joss.21-0175

• Simulation Model/System Credibility Evaluation Technology •

  • About the author: Hui Nie (1987-), female, M.S., lecturer; research interests: image processing and machine learning. E-mail: 928753616@qq.com
  • Funding:
    Special Project in Key Fields (New-Generation Information Technology) of the Guangdong Provincial Department of Education (2021ZDX1029); General Program of the Guangdong Provincial Natural Science Foundation (2020A1515010784); Young Teacher Development Fund of City College of Dongguan University of Technology (2021QJY003Z)

A Quantization Training Algorithm with Adaptive Learning of Quantization Scale Factors

Hui Nie1,2(), Kangshun Li1,2,3(), Yang Su1   

  1. 1.School of Computer and Informatics, City College of Dongguan University of Technology, Dongguan 523430, China
    2.School of Computer Science, Guangdong University of Science and Technology, Dongguan 523000, China
    3.College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510046, China
  • Received: 2021-03-07 Revised: 2021-06-09 Online: 2022-07-30 Published: 2022-07-20
  • Contact: Kangshun Li E-mail: 928753616@qq.com; likangshun@sina.com


Abstract:

Deep neural network models are difficult to deploy effectively on embedded terminals because of their excessive number of parameters; one solution is model miniaturization (e.g., model quantization and knowledge distillation). To address this problem, a quantization training algorithm based on adaptive learning of quantization scale factors with BN (batch normalization) folding, referred to as LSQ-BN, is proposed. A single CNN (convolutional neural network) layer is used to construct BN folding, fusing BN into the convolution. During quantization training, the quantization scale factors are treated as learnable model parameters, and an adaptive initialization scheme is proposed to solve the difficulty of initializing the scale factors. The experimental results show that with 8-bit quantization of both weights and activations, the accuracy of the quantized model is almost identical to that of the FP32 pre-trained model; with 4-bit weight quantization and 8-bit activation quantization, the accuracy loss of the quantized model is within 3%. Therefore, the proposed LSQ-BN is an excellent model quantization algorithm.
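The three steps the abstract describes (folding BN statistics into the preceding convolution, fake-quantizing with a learnable scale factor, and initializing that factor adaptively from the data) can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the function names (`fold_bn`, `init_scale`, `quantize`), the single-channel simplification, and the LSQ-style initialization rule s0 = 2·mean(|x|)/√Qp are assumptions for illustration only.

```python
import math

def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm statistics into the preceding conv layer
    (shown per output channel: a scalar gamma/beta/mean/var and a
    flat list of weights for that channel)."""
    scale = gamma / math.sqrt(var + eps)
    w_fold = [w * scale for w in weight]          # w' = gamma * w / sqrt(var + eps)
    b_fold = beta + (bias - mean) * scale         # b' = beta + gamma * (b - mean) / sqrt(var + eps)
    return w_fold, b_fold

def init_scale(values, num_bits=8):
    """Adaptive scale-factor initialization in the spirit of LSQ:
    s0 = 2 * mean(|x|) / sqrt(Qp), where Qp is the max positive level."""
    qp = 2 ** (num_bits - 1) - 1
    mean_abs = sum(abs(v) for v in values) / len(values)
    return 2.0 * mean_abs / math.sqrt(qp)

def quantize(values, s, num_bits=8):
    """Fake-quantize: round onto the integer grid defined by the
    learnable scale s, clamp to the signed range, then de-quantize."""
    qn, qp = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    out = []
    for v in values:
        q = max(qn, min(qp, round(v / s)))        # clamp(round(v / s), Qn, Qp)
        out.append(q * s)                         # back to the real-valued domain
    return out
```

During quantization-aware training, `s` would then be updated by gradient descent (typically through a straight-through estimator on the rounding step) alongside the folded weights, which is what makes the scale factor a learnable model parameter rather than a fixed calibration constant.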

Key words: BN folding, CNN convolution, adaptive initialization, model quantization scale factor

CLC number: