Journal of System Simulation ›› 2025, Vol. 37 ›› Issue (3): 646-656. doi: 10.16182/j.issn1004731x.joss.23-1300

• Paper

Three-way Decision Clustering Algorithm Fusion of Mutant Fireflies Algorithm

Li Zhaobin1, Ye Jun1,2, Zhou Haoyan1, Wang Yixin3, Han Yuzhen1,2

  1. College of Information Engineering, Nanchang Institute of Engineering, Nanchang 330000, China
  2. Jiangxi Province Key Laboratory of Water Information Cooperative Sensing and Intelligent Processing (Nanchang Institute of Engineering), Nanchang 330000, China
  3. Jiangxi Open University, Nanchang 330000, China
  • Received: 2023-10-29 Revised: 2023-12-20 Online: 2025-03-17 Published: 2025-03-21
  • Corresponding author: Ye Jun
  • About the first author: Li Zhaobin (born 1998), male, master's student; research interests include rough set theory, clustering algorithms, and data mining.
  • Funding:
    National Natural Science Foundation of China (61562061); Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ211920)

Abstract:

Three-way clustering algorithms select their initial cluster centers at random, which can cause premature convergence, and the value of q in the q-nearest-neighbor concept has to be found through repeated experiments. To address these problems, a three-way clustering algorithm optimized by a mutated firefly algorithm is proposed. The firefly algorithm is used to reduce sensitivity to the initial centers: the objective function value serves as the firefly's brightness during the search for cluster centers, and the optimal solution found is used as the cluster centers for the subsequent iterations. A boundary-region membership-degree formula and an adaptive threshold are proposed so that boundary-region samples that satisfy the threshold condition are assigned to the core region wherever possible, which avoids an overly large boundary region. Experimental results on UCI datasets show that the improved algorithm significantly reduces the number of iterations and improves clustering accuracy, and they also confirm its stability and effectiveness.

Key words: clustering algorithm, K-means clustering, three-way decision clustering, firefly algorithm, mutation strategy
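
As a rough illustration of the center search described in the abstract, the sketch below runs a standard firefly algorithm in which each firefly encodes k candidate centers and a lower objective value means a brighter firefly. It is a minimal sketch under stated assumptions, not the authors' method: the objective is assumed to be the within-cluster sum of squared errors, the names `sse` and `firefly_centers` and all parameter values (`beta0`, `gamma`, `alpha`, swarm size, iteration count) are hypothetical, and the paper's mutation strategy, membership-degree formula, and adaptive threshold are not reproduced because the abstract does not specify them.

```python
import math
import random


def sse(centers, points):
    """Sum of squared distances from each point to its nearest center."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total


def firefly_centers(points, k, n_fireflies=15, n_iter=40,
                    beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Search for k cluster centers with a standard firefly algorithm.

    Each firefly holds k candidate centers; lower SSE means brighter.
    Parameter values are illustrative assumptions, not the paper's settings.
    """
    rng = random.Random(seed)
    # Initialise every firefly with k points sampled from the data.
    swarm = [[list(p) for p in rng.sample(points, k)] for _ in range(n_fireflies)]
    cost = [sse(f, points) for f in swarm]
    best_cost = min(cost)
    best = [c[:] for c in swarm[cost.index(best_cost)]]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # j is brighter, so i moves toward j
                    # Squared distance between the two encoded solutions.
                    r2 = sum((a - b) ** 2
                             for ci, cj in zip(swarm[i], swarm[j])
                             for a, b in zip(ci, cj))
                    beta = beta0 * math.exp(-gamma * r2)  # attractiveness
                    for ci, cj in zip(swarm[i], swarm[j]):
                        for d in range(len(ci)):
                            ci[d] += (beta * (cj[d] - ci[d])
                                      + alpha * (rng.random() - 0.5))
                    cost[i] = sse(swarm[i], points)
                    if cost[i] < best_cost:
                        best_cost = cost[i]
                        best = [c[:] for c in swarm[i]]
        alpha *= 0.97  # shrink the random step as the search proceeds
    return best, best_cost
```

The returned centers would then seed the clustering iterations in place of a random initialization. One simplification to keep in mind: centers are paired by position when one firefly moves toward another, so two fireflies that hold similar centers in a different order are treated as far apart.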

CLC number: