Advances and Applications in Statistics
Volume 14, Issue 2, Pages 191 - 204
(February 2010)
SOME STATISTICAL PROPERTIES OF GENE EXPRESSION CLUSTERING FOR ARRAY DATA
G. C. G. Abreu, A. Pinheiro, R. D. Drummond, S. R. Camargo and M. Menossi
Abstract: DNA arrays have been a rich source of data for the study of genomic expression in a wide variety of biological systems. Gene clustering is one of the most widely used paradigms for assessing the significance of a gene (or group of genes). However, most gene clustering techniques are applied to cDNA array data without a corresponding statistical error measure. We propose an easy-to-implement and simple-to-use technique that uses bootstrap resampling to evaluate the statistical error of the nodes provided by clustering based on self-organizing maps (SOM). Comparisons between SOM and parametric clustering are presented for simulated data as well as for two real data sets. We also implement a bootstrap-based pre-processing procedure for SOM that improves the false discovery ratio of differentially expressed genes. Matlab code and some supplementary material are freely available at the following address: https://ipe.cbmeg.unicamp.br/pub/abreu.gcg. An implementation in R is in progress.
Keywords and phrases: array data, genomics, bootstrap resampling.
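
The idea of attaching a bootstrap error measure to SOM nodes can be illustrated with a minimal Python sketch. It is not the authors' Matlab code, and the abstract does not specify their resampling scheme. The sketch assumes that each gene has several replicate expression profiles, that the bootstrap resamples those replicates with replacement, and that the "support" of a gene is the fraction of bootstrap samples in which its mean profile falls in the same node as in the original data. All function names (train_som, bootstrap_node_support, and so on) and the tiny one-dimensional SOM are hypothetical choices made only for illustration.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(nodes, profile):
    """Index of the SOM node closest to the given profile."""
    return min(range(len(nodes)), key=lambda j: dist2(nodes[j], profile))


def mean_profile(replicates):
    """Average the replicate profiles of one gene, condition by condition."""
    n = len(replicates)
    return [sum(col) / n for col in zip(*replicates)]


def train_som(profiles, n_nodes, rng, epochs=50):
    """Minimal 1-D SOM (assumption: nodes on a line, linearly decaying
    learning rate and neighbourhood radius)."""
    nodes = [list(p) for p in rng.sample(profiles, n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        rate = 0.5 * frac
        radius = int((n_nodes // 2) * frac)
        # Visit the profiles in a random order each epoch.
        for p in rng.sample(profiles, len(profiles)):
            win = nearest(nodes, p)
            lo, hi = max(0, win - radius), min(n_nodes, win + radius + 1)
            for j in range(lo, hi):
                nodes[j] = [w + rate * (x - w) for w, x in zip(nodes[j], p)]
    return nodes


def bootstrap_node_support(data, n_nodes=2, n_boot=200, seed=0):
    """data[g] is a list of replicate profiles for gene g.

    Returns (original node per gene, bootstrap support per gene), where
    support is the fraction of resamples assigned to the original node.
    """
    rng = random.Random(seed)
    profiles = [mean_profile(reps) for reps in data]
    nodes = train_som(profiles, n_nodes, rng)
    original = [nearest(nodes, p) for p in profiles]
    support = []
    for reps, node in zip(data, original):
        hits = 0
        for _ in range(n_boot):
            # Resample this gene's replicates with replacement.
            resampled = [rng.choice(reps) for _ in reps]
            if nearest(nodes, mean_profile(resampled)) == node:
                hits += 1
        support.append(hits / n_boot)
    return original, support
```

Under these assumptions, a gene whose support is close to 1 is stably assigned to its node, while a low value flags an assignment that is sensitive to replicate noise. In this sketch the map is trained once on the original data and kept fixed; retraining it on every bootstrap sample is another possible design, but it would require matching nodes across runs.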