Journal of System Simulation (系统仿真学报) ›› 2021, Vol. 33 ›› Issue (11): 2753-2759. doi: 10.16182/j.issn1004731x.joss.21-FZ0708

• Letters / Short Papers •

融合自监督学习的单帧图像运动视差关键点估计

霍志浩1, 金炜东1,2,*, 唐鹏1   

  1. 西南交通大学 电气工程学院,四川 成都 611756;
  2. 南宁学院 中国-东盟综合交通国际联合实验室,广西 南宁 530200
  • 收稿日期:2021-04-30 修回日期:2021-08-18 出版日期:2021-11-18 发布日期:2021-11-17
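  • Corresponding author: Jin Weidong (b. 1959), male, Ph.D., doctoral supervisor, professor; research interest: intelligent information processing. E-mail: wdjin@home.swjtu.edu.cn
  • About the author: Huo Zhihao (b. 1996), male, master's student; research interest: deep learning. E-mail: 171937824@qq.com
  • Supported by:
    National Key R&D Program of China (2016YFB200401-102F)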

Single-frame Image Motion Parallax Key Point Estimation Combined with Self-supervised Learning

Huo Zhihao1, Jin Weidong1,2,*, Tang Peng1   

  1. School of Electrical Engineering, Southwest Jiaotong University, Chengdu 611756, China;
  2. China-ASEAN International Joint Laboratory of Integrated Transport, Nanning University, Nanning, Guangxi 530200, China
  • Received:2021-04-30 Revised:2021-08-18 Online:2021-11-18 Published:2021-11-17

摘要: 运动视差的关键点(Focus of Expansion,FOE)是铁路接触网视频巡检的重要参数,但当前计算FOE的方法需多帧图像匹配估计,时间复杂度高。针对单帧图像FOE估计问题,结合自监督学习思想,提出了一种融合自监督学习的单帧图像FOE估计算法。搭建了全卷积网络F-VGG (Fully-Visual Geometry Group)作为FOE的预测器,通过融合代理任务自动生成样本数据的训练标签,实现了端到端的单帧图像FOE估计。实验结果表明:该方法在FOE预测精度上平均提升13.45%,检测速度提升56.27%,适于实时应用。

关键词: 自监督学习, 运动视差, 全卷积网络, FOE (Focus of Expansion)

Abstract: The focus of expansion (FOE), the key point of motion parallax, is an important parameter in video inspection of railway catenary systems. Current methods for computing the FOE rely on matching across multiple frames and therefore have high time complexity. To estimate the FOE from a single frame, an algorithm that incorporates self-supervised learning is proposed. A fully convolutional network, F-VGG (Fully-Visual Geometry Group), is built as the FOE predictor, and training labels for the sample data are generated automatically by an integrated pretext task, enabling end-to-end FOE estimation from a single frame. Experimental results show that the method improves FOE prediction accuracy by 13.45% on average and detection speed by 56.27%, making it suitable for real-time applications.

Key words: self-supervised learning, motion parallax, fully convolutional network, focus of expansion (FOE)
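
Illustrative note: the abstract contrasts single-frame prediction with conventional multi-frame FOE estimation. As a minimal sketch of that conventional geometric idea (not the paper's F-VGG network or its pretext task, whose details are not given here), the following assumes purely translational camera motion, so that every optical-flow vector lies on a line through the FOE, and solves for the FOE by least squares. The function name `estimate_foe` and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_foe(points, flows):
    """Least-squares FOE from sparse flow vectors (hypothetical helper).

    Assumes pure translation: the flow (u, v) at pixel (x, y) is parallel
    to (x - x0, y - y0), which gives one linear constraint per vector:
        v * x0 - u * y0 = v * x - u * y
    """
    points = np.asarray(points, dtype=float)
    flows = np.asarray(flows, dtype=float)
    u, v = flows[:, 0], flows[:, 1]
    A = np.column_stack([v, -u])              # coefficients of (x0, y0)
    b = v * points[:, 0] - u * points[:, 1]   # right-hand side
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe


# Synthetic check: a radial flow field expanding from a known FOE.
rng = np.random.default_rng(0)
true_foe = np.array([320.0, 180.0])
pts = rng.uniform([0, 0], [640, 360], size=(200, 2))
flw = 0.05 * (pts - true_foe)                 # noise-free expanding flow
est = estimate_foe(pts, flw)
```

An estimate of this kind needs flow vectors matched between at least two frames, which is the per-frame cost that a single-frame predictor avoids at inference time.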
