Journal of System Simulation ›› 2026, Vol. 38 ›› Issue (1): 174-188. doi: 10.16182/j.issn1004731x.joss.25-0863

• Papers

DEHPR: A Diffusion-based End-to-end Hand Pose Reconstruction Network

Liao Guoqiong1,2, Huang Longjie3, Li Qingxin3, Zhang Jiajun1, Chen Kefan1   

  1. Modern Industry School of Virtual Reality (VR), Jiangxi University of Finance and Economics, Nanchang 330032, China
  2. Jiangxi Tourism and Commerce Vocational College, Nanchang 330100, China
  3. School of Computing and Artificial Intelligence, Jiangxi University of Finance and Economics, Nanchang 330032, China
  • Received: 2025-09-07 Revised: 2025-11-19 Online: 2026-01-18 Published: 2026-01-28
  • Contact: Huang Longjie

Abstract:

Traditional methods based on convolutional neural networks (CNNs) and Transformers depend heavily on large-scale annotated data and generalize poorly when reconstructing hand poses in complex scenes. To address these issues, a diffusion-based end-to-end hand pose reconstruction network (DEHPR) is proposed. The method uses a diffusion model to generate and refine 3D predictions directly, which reduces the spatial uncertainty inherent in 2D-to-3D modeling paradigms. An end-to-end framework then reprojects multiple 3D candidate predictions and selects the optimal joint positions, yielding an accurate hand pose estimate. Evaluations on the HO3D V2, DexYCB, and FreiHAND datasets show that DEHPR outperforms existing methods. The proposed approach reduces dependence on large-scale annotated data, mitigates the uncertainty of indirect 2D-to-3D modeling from a single RGB image, and thereby improves both the accuracy and robustness of hand pose reconstruction.
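
The reprojection-based selection of joint positions can be illustrated with a minimal sketch. The abstract does not specify how candidates are scored, so everything here is an assumption made for illustration: a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a set of 2D reference joints `joints_2d`, and a per-joint rule that keeps the candidate with the smallest reprojection error. The function names `project`, `reprojection_error`, and `select_joints` are hypothetical and are not taken from DEHPR.

```python
from math import hypot


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-space 3D point (x, y, z) to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def reprojection_error(point, ref_2d, fx, fy, cx, cy):
    """Euclidean pixel distance between the projected 3D point and a 2D reference."""
    u, v = project(point, fx, fy, cx, cy)
    return hypot(u - ref_2d[0], v - ref_2d[1])


def select_joints(candidates, joints_2d, fx, fy, cx, cy):
    """Assumed selection rule: for each joint index, keep the 3D joint from the
    candidate pose whose reprojection lies closest to the 2D reference joint."""
    selected = []
    for j, ref_2d in enumerate(joints_2d):
        best_pose = min(
            candidates,
            key=lambda pose: reprojection_error(pose[j], ref_2d, fx, fy, cx, cy),
        )
        selected.append(best_pose[j])
    return selected


# Toy data: two candidate poses with two joints each (camera coordinates, metres).
candidates = [
    [(0.00, 0.0, 0.5), (0.10, 0.0, 0.5)],
    [(0.02, 0.0, 0.5), (0.05, 0.0, 0.5)],
]
joints_2d = [(100.0, 100.0), (150.0, 100.0)]  # hypothetical 2D reference joints

pose = select_joints(candidates, joints_2d, fx=500.0, fy=500.0, cx=100.0, cy=100.0)
print(pose)
```

In this toy case the first joint is taken from the first candidate and the second joint from the second, so the selected pose mixes hypotheses joint by joint rather than choosing one whole candidate.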

Key words: diffusion model, end-to-end, hand posture, hand occlusion, pose reconstruction
