Journal of System Simulation ›› 2022, Vol. 34 ›› Issue (10): 2204-2212. doi: 10.16182/j.issn1004731x.joss.21-0484

• Modeling Theory and Methodology •

A K-modes Clustering Method Based on Maximal Information Coefficient Data Preprocessing

Mingmei Li1, Chenglin Wen1,2, Shaolin Hu2

  1. Hangzhou Dianzi University, Hangzhou 310018, China
  2. Guangdong Institute of Petrochemical Technology, Maoming 525000, China
  • Received:2021-05-26 Revised:2021-08-06 Online:2022-10-30 Published:2022-10-18
  • Contact: Chenglin Wen E-mail:851628184@qq.com;wencl@hdu.edu.cn

Abstract:

Existing k-modes clustering methods ignore weak correlations between variable attributes, which often leads to poor clustering performance in practical applications. A new k-modes clustering method that accounts for weak attribute correlations is proposed. The maximal information coefficient (MIC) is introduced to measure the correlation between variable attributes in the data set. The resulting MIC values are merged with the original distance to establish a new dissimilarity measure that contains weak attribute correlation information, which makes the correlation information on variable attributes more complete and yields a more refined k-modes clustering method. Three different data sets are used to compare the new method with the existing k-modes clustering method and other improved k-modes variants. The simulation results show the effectiveness of the new method.
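
The idea of folding attribute correlation into the k-modes distance can be illustrated with a minimal Python sketch. It is not the paper's formulation: the abstract does not give the merging rule, so the weighting used here (one plus the mean correlation of an attribute with all other attributes) is a hypothetical choice, and normalized mutual information is used as a plainly named stand-in for MIC to keep the example dependency-free. All function names are illustrative.

```python
from collections import Counter
from math import log


def simple_matching(x, y):
    """Classic k-modes dissimilarity: count of attributes where x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def normalized_mutual_information(col_a, col_b):
    """Stand-in for MIC (assumption): mutual information of two categorical
    columns divided by the smaller of their entropies, giving a value in [0, 1]."""
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    mi = sum(c / n * log(c * n / (pa[a] * pb[b])) for (a, b), c in pab.items())
    ha = -sum(c / n * log(c / n) for c in pa.values())
    hb = -sum(c / n * log(c / n) for c in pb.values())
    denom = min(ha, hb)
    return mi / denom if denom > 0 else 0.0


def attribute_weights(rows):
    """Hypothetical merging rule: weight_j = 1 + mean correlation of
    attribute j with every other attribute."""
    cols = list(zip(*rows))
    m = len(cols)
    weights = []
    for j in range(m):
        others = [
            normalized_mutual_information(cols[j], cols[k])
            for k in range(m)
            if k != j
        ]
        weights.append(1.0 + (sum(others) / len(others) if others else 0.0))
    return weights


def weighted_matching(x, y, weights):
    """Simple matching distance in which each mismatch costs its attribute weight."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)


# Toy data: attributes 0 and 1 determine each other; attribute 2 is independent.
rows = [("a", "x", "p"), ("a", "x", "q"), ("b", "y", "p"), ("b", "y", "q")]
weights = attribute_weights(rows)
print(weights)
print(simple_matching(rows[0], rows[2]), weighted_matching(rows[0], rows[2], weights))
```

On this toy data the first two attributes determine each other, so they receive a larger weight than the independent third attribute, and a mismatch on them counts for more than under plain simple matching. A real MIC estimator could replace `normalized_mutual_information` without changing the rest of the sketch.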

Key words: clustering algorithm, k-modes, maximal information coefficient (MIC), distance metric, variable attribute
