Research of Chinese Diphthongs Pronunciation Based on DIVA Model


Journal of System Simulation, 2017, Vol. 29, Issue (2): 255-263. doi: 10.16182/j.issn1004731x.joss.201702004

Zhang Shaobai, Chen Yanli, He Liwen

School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210046, Jiangsu, China

Received: 2016-03-15 Revised: 2016-08-09 Online: 2017-02-08 Published: 2020-06-01
About the authors: Zhang Shaobai (b. 1953), male, from Dingxian, Hebei, Ph.D., professor, research interests: intelligent systems and pattern recognition; Chen Yanli (b. 1969), female, from Xiangyang, Hubei, Ph.D., professor, research interests: intelligent systems and pattern recognition.
Funding:
National Natural Science Foundation of China (61271334, 61373065)

Abstract

Abstract: DIVA (Directions Into Velocities of Articulators) is an adaptive neural network model used to simulate and describe the functions of the brain regions involved in speech production and comprehension. Its linguistic basis is the 29 basic phonemes of English. Because Chinese and English pronunciation differ greatly, and the brain mechanisms that process the two languages are also quite different, the model's adaptability to Chinese must be studied specifically before it can be used to "read out" the thought processes of Chinese speakers. Building on the DIVA model, this paper studies the pronunciation of Chinese compound vowels (diphthongs) and discusses issues related to speech production and acquisition in the brain regions of Chinese speakers. By adjusting the model's formants and the parameters of the corresponding articulators in its simulated vocal tract, the newly constructed model can clearly distinguish Chinese vowels from English vowels. This work lays a solid foundation for Chinese speech production and acquisition with the DIVA model.

Keywords: DIVA model, speech pronunciation, Chinese compound vowels, LPMCC

CLC number: TP183

Zhang Shaobai, Chen Yanli, He Liwen. Research of Chinese Diphthongs Pronunciation Based on DIVA Model[J]. Journal of System Simulation, 2017, 29(2): 255-263.

