Journal of System Simulation (系统仿真学报) ›› 2024, Vol. 36 ›› Issue (11): 2517-2527. doi: 10.16182/j.issn1004731x.joss.24-0754

• Special column: Vision-Based Environment Perception for Intelligent Driving

基于知识蒸馏的轻量化Transformer目标检测

王改华1,2, 李柯鸿1, 龙潜1,3, 姚敬萱1, 朱博伦1, 周正书1, 潘旭冉1   

  1. 天津科技大学 人工智能学院,天津 300457
  2. 湖北光信息与模式识别重点实验室,湖北 武汉 430205
  3. 北京中科慧眼科技有限公司,北京 100020
  • 收稿日期:2024-07-15 修回日期:2024-09-19 出版日期:2024-11-13 发布日期:2024-11-19
  • 通讯作者: 李柯鸿
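  • About the first author: Wang Gaihua (1979-), female, associate professor, Ph.D.; her research interests are pattern recognition and image processing.
  • Funding:
    Open Fund of the Hubei Key Laboratory of Optical Information and Pattern Recognition (202306)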

Object Detection of Lightweight Transformer Based on Knowledge Distillation

Wang Gaihua1,2, Li Kehong1, Long Qian1,3, Yao Jingxuan1, Zhu Bolun1, Zhou Zhengshu1, Pan Xuran1   

  1. College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457, China
  2. Hubei Key Laboratory of Optical Information and Pattern Recognition, Wuhan Institute of Technology, Wuhan 430205, China
  3. Beijing Smarter Technology Co., Ltd, Beijing 100020, China
  • Received: 2024-07-15 Revised: 2024-09-19 Online: 2024-11-13 Published: 2024-11-19
  • Contact: Li Kehong

摘要:

在自动驾驶领域,目标检测的高效性和准确性尤为重要,基于Transformer结构的目标检测方法逐渐成为主流,省去了复杂的锚点生成和非极大值抑制。针对现有方法计算成本高和收敛速度慢的问题,设计了一种基于池化操作的轻量化Transformer目标检测模型(LPT),包含了池化主干网络和双池化注意力机制,设计了针对DETR(detection transformer)模型的通用知识蒸馏方法,将预测结果、查询向量和教师提取的特征作为知识传递给轻量化的Transformer模型,帮助其提升精确度性能。通过在MS COCO 2017数据集上的实验,验证经过蒸馏的LPT模型在自动驾驶中的应用潜力,实验结果表明:本文方法具有较好的准确性,与一些先进的方法相比具有一定优势。

关键词: 目标检测, 知识蒸馏, 轻量化, DETR, Transformer, 自动驾驶

Abstract:

In autonomous driving, both the efficiency and the accuracy of object detection are critical. Transformer-based object detectors have gradually become mainstream because they eliminate complex anchor generation and non-maximum suppression (NMS). However, existing methods suffer from high computational cost and slow convergence. To address this, a lightweight pooling Transformer (LPT) object detection model is designed, comprising a pooling backbone network and a dual pooling attention mechanism. A general knowledge distillation method for DETR (detection Transformer) models is also designed; it transfers the teacher's prediction results, query vectors, and extracted features as knowledge to the lightweight LPT model to improve its accuracy. Experiments on the MS COCO 2017 dataset are conducted to verify the application potential of the distilled LPT model in autonomous driving. The results show that the method achieves good accuracy and has certain advantages over some advanced methods.

Key words: object detection, knowledge distillation, lightweight, DETR (detection Transformer), Transformer, autonomous driving
