TY  - JOUR
KW  - Clustering Quality Indexes
KW  - Generalized Mean
KW  - K-Nearest Neighbors
KW  - S-distance
KW  - S-divergence
KW  - Spectral Clustering
KW  - Symmetry Favored
AU  - Krishna Kumar Sharma
AU  - Ayan Seal
AU  - Anis Yazidi
AU  - Ondrej Krejcar
AB  - A clustering validation index (CVI) is employed to evaluate an algorithm’s clustering results. Generally, CVI statistics can be split into three classes, namely internal, external, and relative cluster validations. Most of the existing internal CVIs were designed based on compactness (CM) and separation (SM). The distance between cluster centers is calculated by SM, whereas the CM measures the variance of the cluster. However, the SM between groups is not always captured accurately in highly overlapping classes. In this article, we devise a novel internal CVI that can be regarded as a complementary measure to the landscape of available internal CVIs. Initially, a database’s clusters are modeled as a non-parametric density function estimated using kernel density estimation. Then the S-divergence (SD) and S-distance are introduced for measuring the SM and the CM, respectively. The SD is defined based on the concept of Hermitian positive definite matrices applied to density functions. The proposed internal CVI (PM) is the ratio of CM to SM. The PM outperforms the legacy measures presented in the literature on both superficial and realistic databases in various scenarios, according to empirical results from four popular clustering algorithms, including fuzzy k-means, spectral clustering, density peak clustering, and density-based spatial clustering applied to noisy data.
IS  - Regular Issue
M1  - 4
PY  - 2023
SE  - 127
SP  - 127
EP  - 139
T2  - International Journal of Interactive Multimedia and Artificial Intelligence
TI  - S-Divergence-Based Internal Clustering Validation Index
UR  - https://www.ijimai.org/journal/sites/default/files/2023-11/ijimai8_4_12.pdf
VL  - 8
SN  - 1989-1660
ER  - 