Enhanced SMART Framework For Gene Clustering Using Successive Processing
Rui Fa, Basel Abu-Jamous, David J. Roberts, Asoke K. Nandi

In this paper, we develop an enhanced splitting-merging awareness tactics (E-SMART) framework using successive processing. Instead of selecting the best clustering from a set of results using a clustering selection criterion, as in the original SMART framework, we introduce a successive processing strategy that subtracts clusters from the dataset one by one over iterations. In each iteration, the silhouette index is employed to evaluate the intermediate clusters and rank them from the highest index value to the lowest. We then subtract the best cluster from the dataset and feed the remaining data back into the splitting-while-merging (SWM) process to start a new iteration. Clustering and subtraction are repeated successively and terminate automatically once no splitting occurs in the SWM process, so that all clusters are obtained iteratively. We implement the framework using component-wise expectation maximization (CEM) for finite mixture models (FMM). The E-SMART-FMM implementation is tested on the real NCI-60 cancer dataset. We evaluate the clustering results of the proposed algorithm, together with those of two existing self-splitting algorithms, using two popular validation indices other than the silhouette index. Both validation indices consistently indicate that E-SMART-FMM is superior to the existing algorithms.
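
The successive processing loop described above can be illustrated with a minimal Python sketch. It is not the E-SMART-FMM implementation: the SWM process with CEM for FMM is replaced by a hypothetical stand-in, `gap_split`, which splits one-dimensional data at its widest gap, and the names `silhouette_per_cluster` and `successive_clustering` are chosen here for illustration only. The sketch also assumes that the data left over when no splitting occurs form the final cluster, and that a cluster is scored by the mean silhouette value of its members under Euclidean distance.

```python
import math


def silhouette_per_cluster(points, labels):
    """Mean silhouette value of each cluster (Euclidean distance).

    Expects at least two distinct labels.
    """
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    scores = {}
    for lab, members in clusters.items():
        total = 0.0
        for i in members:
            if len(members) == 1:
                continue  # the silhouette of a singleton is taken as 0
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(points[i], points[j])
                    for j in members if j != i) / (len(members) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(
                sum(math.dist(points[i], points[j]) for j in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab
            )
            denom = max(a, b)
            total += (b - a) / denom if denom > 0 else 0.0
        scores[lab] = total / len(members)
    return scores


def gap_split(points, min_gap=2.0):
    """Hypothetical stand-in for the SWM process (1-D data only).

    Splits the data in two at the widest gap, or returns a single
    label for all points if that gap is narrower than min_gap.
    """
    xs = sorted(p[0] for p in points)
    gaps = [(xs[k + 1] - xs[k], k) for k in range(len(xs) - 1)]
    if not gaps or max(gaps)[0] < min_gap:
        return [0] * len(points)
    _, k = max(gaps)
    cut = (xs[k] + xs[k + 1]) / 2
    return [int(p[0] > cut) for p in points]


def successive_clustering(points, cluster_fn):
    """Cluster, subtract the best-scoring cluster, and repeat."""
    remaining = list(points)
    found = []
    while remaining:
        labels = cluster_fn(remaining)
        if len(set(labels)) < 2:
            # No splitting happened: stop. Treating the remainder as the
            # last cluster is an assumption of this sketch.
            found.append(remaining)
            break
        scores = silhouette_per_cluster(remaining, labels)
        best = max(scores, key=scores.get)
        found.append([p for p, lab in zip(remaining, labels) if lab == best])
        remaining = [p for p, lab in zip(remaining, labels) if lab != best]
    return found
```

Calling `successive_clustering(data, gap_split)` on a list of one-element tuples returns the clusters in the order in which they were subtracted. Any other clustering routine that maps a list of points to a list of labels could be passed in place of `gap_split`.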