Abstract.
Microarray experiments are increasingly used in molecular biology. A common task is to detect genes that are differentially expressed across two experimental conditions, such as two different tissues or the same tissue at two time points of biological development. To take proper account of statistical variability, several approaches based on the t-statistic have been proposed. Constructing the t-statistic requires an estimate of the variance of gene expression levels. With a small number of replicated array experiments, variance estimation can be challenging. For instance, although the sample variance is unbiased, it may have large variability, leading to a large mean squared error. For duplicated array experiments, a new approach based on simple averaging has recently been proposed in the literature. Here we consider two more general approaches based on nonparametric smoothing. Our goal is to assess the performance of each method empirically. The three methods are applied to a colon cancer data set containing 2,000 genes. Using two arrays, we compare the variance estimates obtained from the three methods and examine their impact on the t-statistics. Our results indicate that the three methods give variance estimates close to each other. Because of its simplicity and generality, we recommend the smoothed sample variance for data with a small number of replicates.
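To make the idea of a smoothed sample variance concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that each gene has two replicate measurements, that the variance changes slowly with mean expression level, and that a running mean over genes sorted by mean expression is an acceptable stand-in for the nonparametric smoother. The function names, the `window` parameter, and the simulated data are all hypothetical; the abstract does not specify which smoother or settings were used.

```python
import numpy as np


def duplicate_sample_variance(x1, x2):
    """Per-gene sample variance from two replicates.

    With n = 2 the usual unbiased estimator reduces to (x1 - x2)^2 / 2.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return (x1 - x2) ** 2 / 2.0


def smoothed_variance(x1, x2, window=101):
    """Running-mean smooth of the sample variances against mean expression.

    Assumption: a simple moving average over the `window` nearest genes
    (ranked by mean expression) stands in for the paper's smoother.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    mean_expr = (x1 + x2) / 2.0
    s2 = duplicate_sample_variance(x1, x2)

    # Sort genes by mean expression so neighbours have similar intensity.
    order = np.argsort(mean_expr, kind="stable")
    s2_sorted = s2[order]

    # Moving average via cumulative sums; the window shrinks at the edges.
    n = s2_sorted.size
    half = window // 2
    csum = np.concatenate(([0.0], np.cumsum(s2_sorted)))
    idx = np.arange(n)
    lo = np.maximum(idx - half, 0)
    hi = np.minimum(idx + half + 1, n)
    smooth_sorted = (csum[hi] - csum[lo]) / (hi - lo)

    # Return the estimates in the original gene order.
    smoothed = np.empty(n)
    smoothed[order] = smooth_sorted
    return smoothed


# Simulated duplicates for 2,000 genes (NOT the colon cancer data).
rng = np.random.default_rng(0)
true_mean = rng.uniform(4.0, 12.0, size=2000)
rep1 = true_mean + rng.normal(0.0, 1.0, size=2000)
rep2 = true_mean + rng.normal(0.0, 1.0, size=2000)

raw_var = duplicate_sample_variance(rep1, rep2)
smooth_var = smoothed_variance(rep1, rep2, window=101)
```

In this simulation the true variance is the same for every gene, so the raw per-gene estimates scatter widely around it while the smoothed estimates vary far less. That is the trade-off described above: pooling information across genes with similar expression reduces variability, at the cost of possible bias if the variance is not a smooth function of mean expression.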
Cite this article
Huang, X., Pan, W. Comparing three methods for variance estimation with duplicated high density oligonucleotide arrays. Funct Integr Genomics 2, 126–133 (2002). https://doi.org/10.1007/s10142-002-0066-2
DOI: https://doi.org/10.1007/s10142-002-0066-2