01/04/2007 Biology Medicine
DOI: 10.1097/MOL.0b013e3280895d6f SemanticScholar ID: 27319553 MAG: 2030074178

Identification of differentially expressed genes and false discovery rate in microarray studies

Publication Summary

Purpose of review: To highlight the development in microarray data analysis for the identification of differentially expressed genes, particularly via control of false discovery rate.

Recent findings: The emergence of high-throughput technology such as microarrays raises two fundamental statistical issues: multiplicity and sensitivity. We focus on the biological problem of identifying differentially expressed genes. First, multiplicity arises due to testing tens of thousands of hypotheses, rendering the standard P value meaningless. Second, known optimal single-test procedures such as the t-test perform poorly in the context of highly multiple tests. The standard approach of dealing with multiplicity is too conservative in the microarray context. The false discovery rate concept is fast becoming the key statistical assessment tool replacing the P value. We review the false discovery rate approach and argue that it is more sensible for microarray data. We also discuss some methods to take into account additional information from the microarrays to improve the false discovery rate.

Summary: There is growing consensus on how to analyse microarray data using the false discovery rate framework in place of the classical P value. Further research is needed on the preprocessing of the raw data, such as the normalization step and filtering, and on finding the most sensitive test procedure.
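As an illustration of false discovery rate control, the sketch below implements the Benjamini-Hochberg step-up adjustment, one widely used procedure of this kind. The summary does not say which procedure the review recommends, so the choice of Benjamini-Hochberg, the function names, and the example P values are all assumptions made for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order.

    Assumption: this is one standard FDR procedure, chosen for illustration;
    it is not necessarily the method discussed in the review.
    """
    m = len(p_values)
    # Gene indices sorted from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_genes(p_values, fdr_level=0.05):
    """Indices of genes declared differentially expressed at the given FDR level."""
    adjusted = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(adjusted) if q <= fdr_level]


if __name__ == "__main__":
    # Hypothetical per-gene P values, e.g. from gene-wise t-tests.
    example_p = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(example_p))
    print(select_genes(example_p, fdr_level=0.03))
```

In a real microarray analysis the input would hold one P value per gene, typically tens of thousands of them, and the adjusted values would be compared against the chosen false discovery rate level in place of a per-test P value cutoff.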

CAER Authors

Dr. Arief Gusnanto

University of Leeds

