Advances and Applications in Statistics
Volume 6, Issue 3, Pages 377 - 398
(December 2006)
CLAPPER: A CLUSTERING ALGORITHM BASED ON PATH-DISTANCES FOR EXPERIMENTAL REPLICATES
I. Irigoien (Spain), E. Fernandez (Spain), S. Vives (Spain) and C. Arenas (Spain)
Abstract: Cluster analysis has proven to be a useful tool for investigating the structure of microarray data. This paper describes the application of a novel partitioning cluster procedure for clustering gene expression data with experimental replicates. A new measure of cluster density, called "path distance", is introduced, and the approach considers both the mean of the replicates and the variability between them. This should make more efficient use of the measured data and yield reliable results. At each step the algorithm produces a partition into two clusters, and it requires no prior assumptions about the structure of the clusters or the probability distribution of the genes. It assigns each object to exactly one cluster and finds the global (not merely local) extremum of the function being optimized. The method obtains outstanding results on synthetic data and on a real data set, providing an easy-to-use bioinformatics solution for the cluster analysis of two-color microarray experiments constrained to low replication. The program implementing the algorithm is available upon request from the authors.
Keywords and phrases: clustering analysis, cluster density, path distance, replicate experiments.