Journal of System Simulation, 2024, Vol. 36, Issue (11): 2616-2630. doi: 10.16182/j.issn1004731x.joss.23-0926
Lu Bin1,2, Wang Minghan1,2, Sun Yang1,2, Yang Zhenyu1,2
Received: 2023-07-21
Revised: 2023-09-12
Online: 2024-11-13
Published: 2024-11-19
Contact: Wang Minghan
Lu Bin, Wang Minghan, Sun Yang, Yang Zhenyu. Global-local Fusion for Efficient 3D Object Detection[J]. Journal of System Simulation, 2024, 36(11): 2616-2630.
Table 2 Comparison of AP3D between the proposed method and others (R40)

Car AP3D (IoU=0.7), %

Method | Stage | Easy | Moderate | Hard |
---|---|---|---|---|
PointRCNN | Two | 85.94 | 75.76 | 68.32 |
Point-GNN | Two | 88.33 | 79.47 | 72.29 |
VoTr-TSD | Two | 89.90 | 82.09 | 79.14 |
CT3D | Two | 87.83 | 81.77 | 77.16 |
BtcDet | Two | 90.64 | 82.86 | 78.09 |
VoxelNet | One | 77.82 | 65.11 | 62.85 |
SECOND | One | 83.34 | 73.66 | 66.20 |
PointPillar | One | 86.46 | 77.28 | 74.65 |
SE-SSD | One | 91.49 | 82.54 | 77.15 |
Ours | One | 91.21 | 82.97 | 80.28 |

Cyclist AP3D (IoU=0.7), %

Method | Stage | Easy | Moderate | Hard |
---|---|---|---|---|
PointRCNN | Two | 92.51 | 71.89 | 67.48 |
Point-GNN | Two | ‒ | ‒ | ‒ |
VoTr-TSD | Two | ‒ | ‒ | ‒ |
CT3D | Two | 89.01 | 71.88 | 67.91 |
BtcDet | Two | ‒ | ‒ | ‒ |
VoxelNet | One | ‒ | ‒ | ‒ |
SECOND | One | 82.96 | 66.74 | 62.78 |
PointPillar | One | 81.58 | 62.94 | 58.98 |
SE-SSD | One | ‒ | ‒ | ‒ |
Ours | One | 87.81 | 69.94 | 65.33 |
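
The "(R40)" tag in the caption of Table 2 most likely denotes the 40-recall-position average precision used by the KITTI benchmark, where interpolated precision is averaged at recall levels 1/40, 2/40, ..., 1. The following is a minimal sketch under that assumption; the function name `ap_r40` and its inputs are hypothetical and are not taken from the paper. It takes an already computed precision-recall curve and leaves out the matching step that decides which detections count as true positives at the stated 3D IoU threshold.

```python
def ap_r40(recalls, precisions):
    """Interpolated average precision over 40 recall positions.

    recalls, precisions: paired points on a precision-recall curve,
    given in any order. This is an illustrative helper, not the
    official evaluation code.
    """
    points = list(zip(recalls, precisions))
    total = 0.0
    for i in range(1, 41):
        r = i / 40
        # Interpolated precision at r: the best precision reached at any
        # recall >= r, or 0 if the curve never reaches that recall.
        total += max((p for rc, p in points if rc >= r - 1e-9), default=0.0)
    return total / 40


if __name__ == "__main__":
    # Toy curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
    print(ap_r40([0.5, 1.0], [1.0, 0.5]))
```

Multiplying the returned value by 100 gives a percentage on the same scale as the table entries.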
1 | Qi R, Su Hao, Mo Kaichun, et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2017: 77-85. |
2 | Qi R, Yi Li, Su Hao, et al. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, Inc., 2017: 5105-5114. |
3 | Lang A H, Vora S, Caesar H, et al. PointPillars: Fast Encoders for Object Detection From Point Clouds[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2019: 12689-12697. |
4 | Zhou Sifan, Tian Zhi, Chu Xiangxiang, et al. FastPillars: A Deployment-friendly Pillar-based 3D Detector[EB/OL]. (2023-02-07) [2023-03-08]. |
5 | Shi Guangsheng, Li Ruifeng, Ma Chao. PillarNet: Real-time and High-performance Pillar-based 3D Object Detection[EB/OL]. (2022-08-26) [2023-03-26]. |
6 | Yin Tianwei, Zhou Xingyi, Krähenbühl Philipp. Center-based 3D Object Detection and Tracking[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2021: 11779-11788. |
7 | Zhou Yin, Tuzel O. VoxelNet: End-to-end Learning for Point Cloud Based 3D Object Detection[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2018: 4490-4499. |
8 | Yan Yan, Mao Yuxing, Li Bo. SECOND: Sparsely Embedded Convolutional Detection[J]. Sensors, 2018, 18(10): 3337. |
9 | Vaswani A, Shazeer N, Parmar N, et al. Attention Is All You Need[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, Inc., 2017: 6000-6010. |
10 | Li Jiashi, Xia Xin, Li Wei, et al. Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios[EB/OL]. (2022-08-16) [2023-04-11]. |
11 | Li Jiale, Luo Shujie, Zhu Ziqi, et al. 3D IoU-net: IoU Guided 3D Object Detector for Point Clouds[EB/OL]. (2020-04-10) [2023-04-05]. |
12 | Ren Shaoqing, He Kaiming, Girshick R, et al. Faster R-CNN: Towards Real-time Object Detection with Region Proposal Networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 91-99. |
13 | Li Xiang, Wang Wenhai, Wu Lijun, et al. Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection[C]//Proceedings of the 34th International Conference on Neural Information Processing Systems. Red Hook: Curran Associates, Inc., 2020: 21002-21012. |
14 | Zheng Zhaohui, Wang Ping, Liu Wei, et al. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12993-13000. |
15 | Zhou Dingfu, Fang Jin, Song Xibin, et al. IoU Loss for 2D/3D Object Detection[C]//2019 International Conference on 3D Vision (3DV). Piscataway: IEEE, 2019: 85-94. |
16 | Zheng Wu, Tang Weiliang, Jiang Li, et al. SE-SSD: Self-ensembling Single-stage Object Detector from Point Cloud[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2021: 14489-14498. |
17 | Sheng Hualian, Cai Sijia, Zhao Na, et al. Rethinking IoU-based Optimization for Single-stage 3D Object Detection[C]//Computer Vision – ECCV 2022. Cham: Springer Nature Switzerland, 2022: 544-561. |
18 | Shi Shaoshuai, Wang Xiaogang, Li Hongsheng. PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2019: 770-779. |
19 | Yang Zetong, Sun Yanan, Liu Shu, et al. 3DSSD: Point-based 3D Single Stage Object Detector[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 11037-11045. |
20 | Pan Xuran, Xia Zhuofan, Song Shiji, et al. 3D Object Detection with Pointformer[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2021: 7459-7468. |
21 | Shi Weijing, Rajkumar R. Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2020: 1708-1716. |
22 | Ge Runzhou, Ding Zhuangzhuang, Hu Yihan, et al. AFDet: Anchor Free One Stage 3D Object Detection[EB/OL]. (2020-06-30) [2023-04-26]. |
23 | Hu Yihan, Ding Zhuangzhuang, Ge Runzhou, et al. AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(1): 969-979. |
24 | Zheng Wu, Tang Weiliang, Chen Sijin, et al. CIA-SSD: Confident IoU-aware Single-stage Object Detector from Point Cloud[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, 35(4): 3555-3562. |
25 | Fan Lue, Pang Ziqi, Zhang Tianyuan, et al. Embracing Single Stride 3D Object Detector with Sparse Transformer[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2022: 8448-8458. |
26 | Zou Jiayu, Tian Kun, Zhu Zheng, et al. DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(7): 7846-7854. |
27 | Li Bo, Zhang Tianlei, Xia Tian. Vehicle Detection from 3D Lidar Using Fully Convolutional Network[EB/OL]. (2016-08-29) [2023-05-31]. |
28 | Beltrán Jorge, Guindel Carlos, Francisco Miguel Moreno, et al. BirdNet: A 3D Object Detection Framework from LiDAR Information[C]//2018 21st International Conference on Intelligent Transportation Systems (ITSC). Piscataway: IEEE, 2018: 3517-3523. |
29 | Wang Tai, Zhu Xinge, Lin Dahua. Reconfigurable Voxels: A New Representation for LiDAR-based Point Clouds[C]//Proceedings of the 2020 Conference on Robot Learning. Chia Laguna Resort: PMLR, 2021: 286-295. |
30 | Wu Hai, Wen Chenglu, Li Wei, et al. Transformation-equivariant 3D Object Detection for Autonomous Driving[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(3): 2795-2802. |
31 | Wu Xiaopei, Peng Liang, Yang Honghui, et al. Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2023: 5408-5417. |
32 | Dosovitskiy A, Beyer L, Kolesnikov A, et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale[EB/OL]. (2021-06-03) [2023-05-28]. |
33 | Zhou Yin, Sun Pei, Zhang Yu, et al. End-to-end Multi-view Fusion for 3D Object Detection in LiDAR Point Clouds[C]//Proceedings of the Conference on Robot Learning. Chia Laguna Resort: PMLR, 2020: 923-932. |
34 | Liu Ze, Lin Yutong, Cao Yue, et al. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2021: 9992-10002. |
35 | Zhao Hengshuang, Jiang Li, Jia Jiaya, et al. Point Transformer[EB/OL]. (2021-09-26) [2023-05-13]. |
36 | Mao Jiageng, Xue Yujing, Niu Minzhe, et al. Voxel Transformer for 3D Object Detection[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2021: 3144-3153. |
37 | He Chenhang, Li Ruihuang, Li Shuai, et al. Voxel Set Transformer: A Set-to-set Approach to 3D Object Detection from Point Clouds[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2022: 8407-8417. |
38 | Chen Xuanyao, Liu Zhijian, Tang Haotian, et al. SparseViT: Revisiting Activation Sparsity for Efficient High-resolution Vision Transformer[C]//2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2023: 2061-2070. |
39 | He Kaiming, Zhang Xiangyu, Ren Shaoqing, et al. Deep Residual Learning for Image Recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2016: 770-778. |
40 | Xia Xin, Li Jiashi, Wu Jie, et al. TRT-ViT: TensorRT-oriented Vision Transformer[EB/OL]. (2022-07-12) [2023-06-01]. |
41 | Li Yanyu, Yuan Geng, Wen Yang, et al. EfficientFormer: Vision Transformers at MobileNet Speed[C]//Advances in Neural Information Processing Systems. Red Hook: Curran Associates, Inc., 2022: 12934-12949. |
42 | Li Xiang, Wang Wenhai, Hu Xiaolin, et al. Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2021: 11627-11636. |
43 | Geiger Andreas, Lenz Philip, Urtasun R. Are We Ready for Autonomous Driving? The KITTI Vision Benchmark Suite[C]//Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE, 2012: 3354-3361. |
44 | Sheng Hualian, Cai Sijia, Liu Yuan, et al. Improving 3D Object Detection with Channel-wise Transformer[C]//2021 IEEE/CVF International Conference on Computer Vision (ICCV). Piscataway: IEEE, 2021: 2723-2732. |
45 | Xu Qiangeng, Zhong Yiqi, Neumann U. Behind the Curtain: Learning Occluded Shapes for 3D Object Detection[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(3): 2893-2901. |