Advances and Applications in Statistics
Volume 33, Issue 1, Pages 63 - 81
(March 2013)
COMPARISON OF CLUSTERING ALGORITHMS: AN EXAMPLE WITH PROTEOMIC DATA
Nairanjana Dasgupta, Yibing Chen, Ananth Kalyanaraman and Sayed Daoud
Abstract: Clustering has become the norm in analyzing genomic and proteomic data. Various algorithms exist, for which graphs and software are easily available. However, these methods often show little agreement, which prompts the question of which algorithm one should choose. In this manuscript, we consider eleven different methods of clustering, including partitioning, hierarchical, and model-based methods, to examine proteomic data from a colon cancer study. We use a pairwise index to evaluate these eleven methods by comparing their clustering results. To determine the best method under a given structure, we perform a simulation study that compares the eleven methods under different distributional structures. The data analysis and the simulation results agree that the clustering obtained depends on the method chosen. However, our study suggests that k-means and model-based methods tend to perform reasonably well under all data structures used.
Keywords and phrases: model based clustering, k-means, hierarchical clustering, colon cancer.
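
The abstract does not say which pairwise index the authors use. As an illustration only, the sketch below assumes the plain Rand index, one common pairwise measure: for every pair of observations it checks whether two clusterings agree on placing the pair together or apart, and reports the fraction of agreeing pairs. The function name `rand_index` and the toy label vectors are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of observation pairs on which two clusterings agree.

    A pair counts as an agreement when both clusterings put the two
    observations in the same cluster, or both put them in different
    clusters. This is an assumed stand-in for the paper's pairwise index.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same observations")
    if len(labels_a) < 2:
        raise ValueError("at least two observations are needed to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        if same_in_a == same_in_b:
            agreements += 1
        total_pairs += 1
    return agreements / total_pairs


# Toy example: cluster labels are arbitrary, so a relabelled copy of the
# same partition still agrees on every pair.
method_1 = [0, 0, 1, 1]
method_2 = [1, 1, 0, 0]
method_3 = [0, 1, 0, 1]
print(rand_index(method_1, method_2))
print(rand_index(method_1, method_3))
```

Because the index looks only at pairs, it does not require the two methods to use matching label names or even the same number of clusters, which is what makes this family of indices convenient for comparing many clustering methods against one another.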