Journal of System Simulation, 2019, Vol. 31, Issue (9): 1755-1762. doi: 10.16182/j.issn1004731x.joss.19-0401

• Special column: Modeling and Simulation of Complex Systems

High-dimensional data hiding pattern mining based on topology data analysis

Liu Bolong1, Li Zhe2

  1. School of Electrical Engineering, Xinjiang University, Urumqi 830047, Xinjiang, China;
  2. Network and Information Technology Center, Xinjiang University, Urumqi 830046, Xinjiang, China
  • Received: 2019-07-31 Revised: 2019-08-02 Published: 2019-12-12
  • Corresponding author: Li Zhe (1977-), female, from Jiangsu, master's degree, associate professor; research interest: big data analysis.
  • About the author: Liu Bolong (1993-), male, from Gansu, master's student; research interests: deep learning and data mining.
  • Funding:
    National Natural Science Foundation of China (51767022, 51575469)

Abstract: Traditional data analysis methods are limited in their ability to find hidden patterns in complex high-dimensional data. To address this, a hidden pattern mining method for high-dimensional data based on topological data analysis is proposed. The method extracts features from complex high-dimensional data and analyzes the shape of the data and the relationships among samples to obtain the hidden patterns of the dataset. The method is validated on a high-dimensional dataset for gender recognition by voice, and the data subgroups and the relationships between related subgroups are analyzed visually. The results show that the proposed method can discover implicit relationships and patterns among data subgroups that traditional methods cannot find, and that it yields finer and more effective results than traditional methods. This verifies the power and effectiveness of the proposed method for mining hidden patterns in high-dimensional data.

Key words: topological data analysis, hidden pattern mining, high-dimensional data
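
The abstract does not name the specific topological data analysis pipeline used. As an illustration only, the sketch below shows a minimal Mapper-style graph construction, a widely used TDA technique for exposing subgroups and the relations between them. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names `connected_components` and `mapper_graph`, the `lens` (filter) function, and the parameters `n_intervals`, `overlap`, and `eps` are all hypothetical.

```python
from itertools import combinations
from math import dist


def connected_components(indices, points, eps):
    """Cluster the given point indices: points within eps are linked."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining if dist(points[i], points[j]) <= eps}
            remaining -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, lens, n_intervals=4, overlap=0.3, eps=1.0):
    """Build a Mapper-style graph (illustrative sketch, not the paper's code).

    nodes: list of frozensets of point indices (one per local cluster)
    edges: set of (i, j) node-index pairs whose clusters share a point
    """
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals or 1.0  # avoid zero width
    pad = width * overlap                   # makes neighbouring intervals overlap
    nodes = []
    for k in range(n_intervals):
        a, b = lo + k * width - pad, lo + (k + 1) * width + pad
        members = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(connected_components(members, points, eps))
    edges = {(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]}
    return nodes, edges


# Toy data: two parallel lines that a single lens value cannot separate.
points = [(float(x), 0.0) for x in range(10)] + [(float(x), 10.0) for x in range(10)]
nodes, edges = mapper_graph(points, lens=lambda p: p[0],
                            n_intervals=3, overlap=0.3, eps=1.5)
```

On this toy input the two lines stay in separate connected components of the graph, which is the kind of subgroup structure a Mapper visualization is meant to reveal. A real analysis would use a domain-appropriate lens and clustering method.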
