Dense Video Description Method Based on Multi-modal Fusion in Transformer Network
Li Xiang, Sang Haifeng
Journal of System Simulation. 2024, (5): 1061-1071. DOI: 10.16182/j.issn1004731x.joss.23-0017