Journal of System Simulation, 2024, Vol. 36, Issue (11): 2517-2527. doi: 10.16182/j.issn1004731x.joss.24-0754
Wang Gaihua1,2, Li Kehong1, Long Qian1,3, Yao Jingxuan1, Zhu Bolun1, Zhou Zhengshu1, Pan Xuran1
Received: 2024-07-15
Revised: 2024-09-19
Online: 2024-11-13
Published: 2024-11-19
Contact: Li Kehong
CLC Number:
Wang Gaihua, Li Kehong, Long Qian, Yao Jingxuan, Zhu Bolun, Zhou Zhengshu, Pan Xuran. Object Detection of Lightweight Transformer Based on Knowledge Distillation[J]. Journal of System Simulation, 2024, 36(11): 2517-2527.
Table 2
Influence of the distillation model on DETR-type models
Model | Role | Backbone | AP | AP_S | AP_M | AP_L |
---|---|---|---|---|---|---|
Deformable DETR | Teacher | ResNet-101 | 45.5 | 27.5 | 48.7 | 60.3 |
Deformable DETR | Student (not distilled) | ResNet-50 | 44.1 | 27.0 | 47.4 | 58.3 |
Deformable DETR | Student (distilled) | ResNet-50 | 46.6 | 28.5 | 48.6 | 61.0 |
Conditional DETR | Teacher | ResNet-101 | 42.4 | 22.6 | 46.0 | 61.2 |
Conditional DETR | Student (not distilled) | ResNet-50 | 40.7 | 20.3 | 43.8 | 60.0 |
Conditional DETR | Student (distilled) | ResNet-50 | 42.9 | 21.6 | 46.5 | 62.2 |
LPT | Teacher | HgnetV2 | 48.1 | 29.3 | 51.9 | 66.4 |
LPT | Student (not distilled) | Pooling backbone | 45.7 | 27.8 | 49.0 | 63.8 |
LPT | Student (distilled) | Pooling backbone | 48.3 | 28.9 | 49.7 | 65.5 |
Table 3
Test results of DETR-type models on the MS COCO 2017 dataset
Model | Parameters/M | Computational complexity/G | Frame rate/(frames/s) | AP | AP50 | AP75 | AP_S | AP_M | AP_L |
---|---|---|---|---|---|---|---|---|---|
DETR | 41.580 | 86.556 | 14.80 | 15.5 | 29.4 | 14.5 | 4.3 | 15.1 | 26.7 |
DAB-DETR | 43.722 | 90.740 | 10.80 | 38.0 | 60.3 | 39.8 | 19.2 | 40.9 | 55.4 |
RT-DETR | 42.940 | 69.157 | 19.99 | 47.0 | 64.6 | 50.8 | 28.5 | 51.1 | 65.2 |
Ours (not distilled) | 41.814 | 60.949 | 22.02 | 45.7 | 63.4 | 48.9 | 27.8 | 49.0 | 63.8 |
Ours (distilled) | 41.814 | 60.949 | 22.02 | 48.3 | 64.4 | 51.2 | 28.9 | 49.7 | 65.5 |