Pushpa Publishing House

Advances and Applications in Statistics

Volume 14, Issue 2, Pages 191 - 204 (February 2010)

SOME STATISTICAL PROPERTIES OF GENE EXPRESSION CLUSTERING FOR ARRAY DATA

G. C. G. Abreu, A. Pinheiro, R. D. Drummond, S. R. Camargo and M. Menossi

Abstract:

DNA arrays have been a rich source of data for the study of genomic expression in a wide variety of biological systems. Gene clustering is one of the paradigms widely used to assess the significance of a gene (or group of genes). However, most gene clustering techniques are applied to cDNA array data without a corresponding measure of statistical error. We propose an easy-to-implement and simple-to-use technique that uses bootstrap resampling to evaluate the statistical error of the nodes produced by clustering based on self-organizing maps (SOM). Comparisons between SOM and parametric clustering are presented for simulated data as well as for two real data sets. We also implement a bootstrap-based pre-processing procedure for SOM that improves the false discovery ratio of differentially expressed genes. Matlab code and some supplementary material are freely available at the following address: https://ipe.cbmeg.unicamp.br/pub/abreu.gcg. An implementation in R is in progress.
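To illustrate the general idea of attaching a bootstrap error measure to SOM nodes, here is a minimal Python sketch. It is not the authors' Matlab code, and the abstract does not specify their resampling scheme. The sketch assumes that rows of the data matrix (genes) are resampled with replacement, that a small 1-D SOM is used, and that each bootstrap refit starts from the originally fitted codebook so that node identities stay comparable. The function names `train_som` and `bootstrap_node_se` and all parameter defaults are hypothetical.

```python
import numpy as np


def train_som(data, weights, n_epochs=20, lr0=0.5, sigma0=1.0, rng=None):
    """Train a 1-D self-organizing map and return the updated codebook."""
    rng = np.random.default_rng() if rng is None else rng
    w = weights.astype(float).copy()
    grid = np.arange(w.shape[0])          # node positions on a 1-D lattice
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for i in rng.permutation(len(data)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                   # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3      # shrinking neighbourhood
            # Best-matching unit: node closest to the current sample.
            bmu = np.argmin(((w - data[i]) ** 2).sum(axis=1))
            # Gaussian neighbourhood weights around the BMU.
            h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
            w += lr * h[:, None] * (data[i] - w)
            step += 1
    return w


def bootstrap_node_se(data, n_nodes=3, n_boot=50, seed=0):
    """Return the fitted codebook and a bootstrap standard error per node coordinate.

    Assumption: rows (genes) are resampled with replacement; the scheme
    used in the paper may differ.
    """
    rng = np.random.default_rng(seed)
    init = data[rng.choice(len(data), size=n_nodes, replace=False)]
    base = train_som(data, init, rng=rng)
    boot = np.empty((n_boot,) + base.shape)
    for b in range(n_boot):
        sample = data[rng.integers(0, len(data), size=len(data))]
        # Warm start from the fitted codebook keeps node labels aligned.
        boot[b] = train_som(sample, base, n_epochs=5, rng=rng)
    return base, boot.std(axis=0, ddof=1)
```

Calling `bootstrap_node_se(expr)` on a genes-by-conditions array `expr` returns the node codebook together with an array of the same shape holding the bootstrap standard errors. Nodes with large standard errors relative to the distance between nodes would be the ones to treat with caution.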

Keywords and phrases:

array data, genomics, bootstrap resampling.
