A New Information Theoretic Clustering Algorithm Using K-NN
Vidar Vikjord, Robert Jenssen

We develop a new non-parametric hierarchical information theoretic clustering algorithm based on implicit estimation of cluster densities using k-nearest neighbors (k-nn). Compared to a kernel-based procedure, our k-nn approach is very robust with respect to the parameter choices, with a key ability to detect clusters of vastly different scales. Of particular importance is the use of two different values of k, depending on the evaluation of within-cluster entropy or across-cluster cross-entropy in order to obtain the final clustering. We conduct clustering experiments, and report promising results, focusing in particular on the proposed algorithm"s robustness to scale.