Journal of System Simulation, 2024, Vol. 36, Issue 8: 1854-1868. doi: 10.16182/j.issn1004731x.joss.23-0833
Liu Peijin1, Fu Xuefeng1, Sun Haofeng3, He Lin2, Liu Shujie1
Received: 2023-07-04
Revised: 2023-09-18
Online: 2024-08-15
Published: 2024-08-19
Contact: He Lin
CLC Number:
Liu Peijin, Fu Xuefeng, Sun Haofeng, He Lin, Liu Shujie. A Highly Robust Target Tracking Algorithm Merging CNN and Transformer[J]. Journal of System Simulation, 2024, 36(8): 1854-1868.
Table 1
Architecture of feature extraction network
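Layer | Output channels | Template image size | Search image size |
---|---|---|---|
Input image | | 128×128 | 256×256 |
3×3 convolution | 16 | 64×64 | 128×128 |
Inverted residual block | 32 | | |
Inverted residual block, downsampling | 64 | 32×32 | 64×64 |
Inverted residual block | 64 | | |
Inverted residual block, downsampling | 128 | 16×16 | 32×32 |
Convolution-like Transformer block, N=2 | 128 | | |
Inverted residual block, downsampling | 256 | 8×8 | 16×16 |
Convolution-like Transformer block, N=4 | 256 | | |
Inverted residual block, downsampling | 320 | 4×4 | 8×8 |
Convolution-like Transformer block, N=3 | 320 | | |
1×1 convolution | 1024 | 4×4 | 8×8 |
Parameters (M) | 4.98 | | |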
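The spatial sizes in Table 1 are consistent with the resolution being halved five times: once at the first 3×3 convolution and once at each of the four downsampling inverted residual blocks. The following is a minimal sketch of that arithmetic, assuming each of these layers uses stride 2 with padding that gives exact halving (the table does not state strides or padding). The function name `stage_sizes` is illustrative and not taken from the paper.

```python
def stage_sizes(input_size, num_halvings=5):
    """Return the feature-map side length after each assumed stride-2 stage."""
    sizes = []
    size = input_size
    for _ in range(num_halvings):
        size //= 2  # assumption: stride 2 with padding that halves exactly
        sizes.append(size)
    return sizes


# Template branch starts at 128x128, search branch at 256x256 (Table 1).
template_sizes = stage_sizes(128)
search_sizes = stage_sizes(256)
print(template_sizes)
print(search_sizes)
```

Under this assumption the overall stride of the backbone is 32, which is why the final template and search feature maps are 4×4 and 8×8.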
Table 2
Comparison results of different algorithms on the VOT2018 and VOT2019 datasets
Algorithm | VOT2018 EAO↑ | VOT2018 A↑ | VOT2018 R↓ | VOT2019 EAO↑ | VOT2019 A↑ | VOT2019 R↓ |
---|---|---|---|---|---|---|
Ours | 0.490 | 0.617 | 0.112 | 0.578 | 0.266 | |
Ocean | 0.592 | 0.350 | 0.594 | 0.316 | ||
SiamBAN | 0.452 | 0.597 | 0.178 | 0.327 | 0.396 | |
DiMP | 0.440 | 0.597 | 0.153 | 0.379 | 0.594 | |
SiamRPN++ | 0.414 | 0.600 | 0.234 | 0.292 | 0.580 | 0.446 |
SiamR-CNN | 0.408 | 0.220 | — | — | — | |
ATOM | 0.401 | 0.590 | 0.204 | 0.301 | 0.603 | 0.411 |
DaSiamRPN | 0.383 | 0.586 | 0.276 | — | — | — |
SiamFC | 0.188 | 0.503 | 0.585 | — | — | — |
Table 3
Comparison results of different algorithms on the GOT-10k, LaSOT, OTB2015, UAV123 and NFS datasets
Algorithm | GOT-10k AO | GOT-10k SR0.5 | GOT-10k SR0.75 | LaSOT AUC | LaSOT P | OTB2015 AUC | OTB2015 P | UAV123 AUC | NFS (30 frames/s) AUC |
---|---|---|---|---|---|---|---|---|---|
Ours | 70.8 | 82.4 | 65.8 | 67.7 | 70.2 | 71.1 | 92.2 | 71.9 | |
TransT | 69.4 | — | 69.1 | 65.7 | |||||
SiamR-CNN | 64.9 | 72.8 | 59.7 | 64.8 | 68.4 | 89.1 | 64.9 | 63.9 | |
Ocean | 61.1 | 72.1 | 47.3 | 56.0 | 56.6 | 68.4 | — | — | |
DiMP | 61.1 | 71.7 | 49.2 | 56.9 | 56.7 | 68.4 | 90.2 | 65.3 | 62.0 |
ATOM | 55.6 | 63.4 | 40.2 | 51.5 | 50.5 | 66.7 | 87.9 | 64.2 | 58.4 |
SiamBAN | — | — | — | 51.4 | 52.1 | 69.6 | 91.0 | 63.1 | 59.4 |
SiamRPN++ | 51.7 | 61.6 | 32.5 | 49.6 | 49.1 | 69.6 | 91.5 | 61.3 | 50.2 |
DaSiamRPN | 44.4 | 53.6 | 22.0 | 41.5 | — | 65.9 | 88.0 | 58.6 | 39.5 |
CFNet | 43.4 | 48.1 | 19.0 | 27.5 | 25.9 | 56.8 | 74.8 | 43.6 | — |
SiamFC | 34.8 | 35.3 | 9.8 | 33.6 | 33.9 | 58.2 | 77.1 | 46.1 | — |
fDSST | 28.9 | 27.8 | 12.1 | 20.3 | 18.4 | 55.1 | 72.6 | — | — |
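For readers unfamiliar with the GOT-10k columns in Table 3: AO is the average overlap (IoU) between predicted and ground-truth boxes, and SR0.5 / SR0.75 are the fractions of frames whose overlap exceeds 0.5 and 0.75. Below is a simplified sketch of these definitions, assuming boxes in `(x, y, w, h)` format and a single flat list of frames. The official evaluation toolkit may differ in details such as per-sequence averaging, and the helper names `iou` and `ao_and_sr` are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def ao_and_sr(pred_boxes, gt_boxes, thresholds=(0.5, 0.75)):
    """Average overlap and success rate at each threshold over all frames."""
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    ao = sum(overlaps) / len(overlaps)
    sr = {t: sum(o > t for o in overlaps) / len(overlaps) for t in thresholds}
    return ao, sr


# Toy example: one perfect prediction, one shifted by half the box width.
pred = [(0, 0, 10, 10), (0, 0, 10, 10)]
gt = [(0, 0, 10, 10), (5, 0, 10, 10)]
ao, sr = ao_and_sr(pred, gt)
print(ao, sr)
```

Table 3 reports these quantities as percentages, so a value such as 70.8 corresponds to an average overlap of 0.708.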