Journal of System Simulation ›› 2024, Vol. 36 ›› Issue (12): 2834-2849. DOI: 10.16182/j.issn1004731x.joss.24-FZ0797

• Papers •

Research on Latent Space-based Anime Face Style Transfer and Editing Techniques

Deng Haixin1, Zhang Fengquan1, Wang Nan1, Zhang Wancai2, Lei Jierui3   

  1. Beijing University of Posts and Telecommunications, Beijing 100876, China
    2. NARI Technology Co., Ltd., Nanjing 211106, China
    3. North China University of Technology, Beijing 100144, China
  • Received: 2024-07-20 Revised: 2024-09-29 Online: 2024-12-20 Published: 2024-12-20
  • Contact: Zhang Fengquan

Abstract:

To address image distortion and stylistic homogeneity in existing anime style transfer networks within the field of image simulation, we propose TGFE-TrebleStyleGAN (text-guided facial editing with TrebleStyleGAN), a framework for anime facial style transfer and editing. The framework uses vector guidance within the latent space to generate facial imagery and incorporates a detail control module and a feature control module to constrain the aesthetic attributes of the generated images. Images produced by the transfer network serve both as style control signals and as constraints for fine-grained segmentation, while text-to-image generation technology captures the correlations between style-transferred images and their semantic information. Experiments on both an open-source dataset and a self-constructed dataset of anime faces with paired attribute tags show that, compared with DualStyleGAN, the proposed model reduces the FID score by 2.819 and improves the SSIM and NIMA scores by 0.028 and 0.074, respectively. By combining style transfer with editing, the model retains anime facial details while allowing flexible adjustment, minimizing distortion and enhancing feature consistency and style similarity.
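
The core mechanism the abstract describes, steering a face's latent code along a semantic direction before decoding, can be illustrated with a minimal sketch. Everything below (ToyGenerator, edit_latent, alpha, the latent dimension) is a hypothetical stand-in under typical StyleGAN conventions, not the paper's actual TGFE-TrebleStyleGAN components or API; in the paper's setting the edit direction would come from the text-guided module rather than a random vector.

```python
# Minimal sketch of latent-space attribute editing in a StyleGAN-style
# pipeline. All names are illustrative placeholders, not the paper's code.
import torch

LATENT_DIM = 512  # typical StyleGAN w-space dimensionality (assumption)

class ToyGenerator(torch.nn.Module):
    """Placeholder for a pretrained generator G: latent code w -> image."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(LATENT_DIM, 3 * 64 * 64)

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        # Decode a (batch, LATENT_DIM) latent into a (batch, 3, 64, 64) image.
        return self.net(w).view(-1, 3, 64, 64)

def edit_latent(w: torch.Tensor, direction: torch.Tensor, alpha: float) -> torch.Tensor:
    """Shift a latent code along a semantic direction: w' = w + alpha * d.

    In a text-guided editor, `direction` would be a learned attribute vector
    (e.g. derived from text-image correlations); here it is just normalized.
    """
    return w + alpha * direction / direction.norm()

G = ToyGenerator()
w = torch.randn(1, LATENT_DIM)   # latent code of a source face
d = torch.randn(LATENT_DIM)      # stand-in for a learned edit direction
img_before = G(w)
img_after = G(edit_latent(w, d, alpha=2.0))  # larger alpha = stronger edit
print(img_before.shape, img_after.shape)     # torch.Size([1, 3, 64, 64]) each
```

Because the edit is a vector operation in latent space rather than a pixel-space change, the generator's learned face structure is preserved, which is why this family of methods can adjust attributes with limited distortion.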

Key words: anime style transfer, GAN, latent space, anime facial editing, text-guided image generation

CLC Number: