Journal of System Simulation ›› 2018, Vol. 30 ›› Issue (6): 2102-2109. doi: 10.16182/j.issn1004731x.joss.201806013

• Simulation Modeling Theory and Methods •

Clustering Method Based on Graph Data Model and Reliability Detection

Cheng Yanyun1, Bian Huisong1, Bian Changsheng2

  1. School of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China;
  2. Nanjing College of Information Technology, Nanjing 210023, Jiangsu, China
  • Received: 2016-07-20 Revised: 2016-09-17 Online: 2018-06-08 Published: 2018-06-14
  • About the authors: Cheng Yanyun (b. 1974), female, from Jiangyan, Jiangsu; master's degree, associate professor; research interests: applications of big data in mobile communication networks and network optimization. Bian Huisong (b. 1989), male, from Dezhou, Shandong; master's student; research interests: cluster analysis and anomalous tensor block detection.
  • Supported by:
    Jiangsu Provincial Special Guidance Fund for the Development of Modern Service Industry (Software Industry) (SJ214038)

Abstract: Traditional clustering algorithms usually analyze data directly in the feature space, so clustering results for high-dimensional data cannot be visualized intuitively and effectively in a two-dimensional plane. Graph data, by contrast, explicitly reflect the similarity relationships between objects. Based on the distances between data objects, the feature-space data are modeled iteratively as graph data. Modularity-based cluster analysis is then carried out on the resulting graph model, which makes it possible to cluster data sets with non-spherical distributions and to visualize the clustering results in two dimensions. The concept of the credibility of a clustering result with respect to the adjacent boundaries between clusters is introduced, and a method that uses the PageRank algorithm to compute this credibility is proposed.

Key words: data mining, clustering, graph data modeling, modularity, PageRank algorithm

CLC Number:
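
The three ingredients named in the abstract (distance-based graph modeling, modularity, and PageRank) can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not describe the iterative graph construction or how PageRank scores are turned into a credibility value, so a plain k-nearest-neighbour graph stands in for the modeling step, and PageRank scores are only computed, not interpreted. The function names `knn_graph`, `modularity`, and `pagerank`, the parameter `k`, and the toy points are all hypothetical.

```python
import math


def knn_graph(points, k):
    """Undirected k-nearest-neighbour graph as {node: set(neighbours)}.

    Assumption: a kNN graph replaces the paper's iterative modeling step,
    which the abstract does not specify.
    """
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Sort all other points by Euclidean distance to point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrise: store the edge in both directions
    return adj


def modularity(adj, labels):
    """Newman modularity Q = sum_c [L_c / m - (d_c / (2m))^2]."""
    m = sum(len(nb) for nb in adj.values()) / 2  # number of edges
    intra = {}   # L_c: edges with both endpoints in community c
    degree = {}  # d_c: summed degree of the nodes in community c
    for i, nb in adj.items():
        c = labels[i]
        degree[c] = degree.get(c, 0) + len(nb)
        for j in nb:
            if i < j and labels[j] == c:  # count each edge once
                intra[c] = intra.get(c, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on an undirected graph.

    Assumes every node has at least one neighbour (true for a kNN graph
    with k >= 1 and more than k points), so no dangling-node handling.
    """
    n = len(adj)
    rank = {i: 1.0 / n for i in adj}
    for _ in range(iters):
        rank = {i: (1 - damping) / n
                   + damping * sum(rank[j] / len(adj[j]) for j in adj[i])
                for i in adj}
    return rank


# Toy data: two well-separated unit squares in the plane.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
graph = knn_graph(points, k=2)
q_good = modularity(graph, [0, 0, 0, 0, 1, 1, 1, 1])  # one label per square
q_bad = modularity(graph, [0, 1, 0, 1, 0, 1, 0, 1])   # interleaved labels
scores = pagerank(graph)
print(q_good, q_bad, scores)
```

On this toy input the partition that follows the two squares has higher modularity than the interleaved one, which is the property a modularity-based clustering step exploits. A real data set would call for a proper community-detection routine and a principled choice of `k`.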