Title:
A Bayesian Mixture Model for Differential Gene Expression
Speaker:
Kim-Anh Do, Department of Biostatistics & Applied Mathematics, The University of Texas M.D. Anderson Cancer Center
Subject:
Analysis of Gene Expression Data: Principles and Applications
Area:
Medicine
Type of school:
university
School name:
Ohio State University
Country:
United States
Course language:
English
Course media:
Video
Course duration:
Contributor:
pbp
Comments:
Speaker: Kim-Anh Do, Department of Biostatistics & Applied Mathematics, The University of Texas M.D. Anderson Cancer Center
Collaborators: Peter Mueller and Feng Tang
Title: A Bayesian Mixture Model for Differential Gene Expression
Presentation Materials: PDF
Streaming Video: Real Media
Model-based inference is proposed for differential gene expression, using a nonparametric Bayesian probability model for the distribution of gene intensities under different conditions. The probability model is essentially a mixture of normals; specifically, it is a variation of the traditional Dirichlet process (DP) mixture model. The model includes an additional mixture corresponding to the assumption that transcription levels arise as a mixture over non-differentially and differentially expressed genes.
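The additional mixture over non-differentially and differentially expressed genes can be illustrated with a small sketch. It is not the speaker's implementation: it assumes a two-group density f(z) = p0 f0(z) + (1 - p0) f1(z), and it replaces the DP mixtures with fixed finite normal mixtures. The names `prob_differential`, `P0`, `NULL_COMPONENTS`, and `ALT_COMPONENTS`, and all numeric values, are hypothetical.

```python
import math


def normal_pdf(z, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(z, components):
    """Density of a finite normal mixture; components is a list of (weight, mean, sd)."""
    return sum(w * normal_pdf(z, m, s) for w, m, s in components)


def prob_differential(z, p0, null_components, alt_components):
    """Posterior probability that a gene with score z is differentially expressed.

    Assumes the two-group form f(z) = p0 * f0(z) + (1 - p0) * f1(z) and applies
    Bayes' rule with all mixture parameters treated as known.
    """
    f0 = mixture_pdf(z, null_components)
    f1 = mixture_pdf(z, alt_components)
    return (1.0 - p0) * f1 / (p0 * f0 + (1.0 - p0) * f1)


# Hypothetical parameter values, chosen only for illustration.
P0 = 0.9                                             # share of non-differential genes
NULL_COMPONENTS = [(1.0, 0.0, 1.0)]                  # f0: scores centred at zero
ALT_COMPONENTS = [(0.5, -3.0, 1.0), (0.5, 3.0, 1.0)] # f1: under- and over-expression

if __name__ == "__main__":
    for score in (0.0, 1.5, 3.0):
        p = prob_differential(score, P0, NULL_COMPONENTS, ALT_COMPONENTS)
        print(f"z = {score:4.1f}  P(differential | z) = {p:.4f}")
```

In a fully Bayesian analysis the mixture parameters would not be fixed as they are here; they would be sampled, and the probability above would be averaged over the posterior draws.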
Inference proceeds as in DP mixture models, with an additional set of latent indicators to resolve this additional mixture. The use of fully model-based inference mitigates some of the necessary limitations of the empirical Bayes method (Efron, JASA 2001). However, the increased generality of our method comes at a price: computation is not as straightforward as in the empirical Bayes scheme. Still, we argue that inference is no more difficult than posterior simulation in a traditional nonparametric mixture-of-normals model. We illustrate the proposed method with two examples: a simulation study and a microarray experiment to screen for genes with differential expression in colon cancer versus normal tissue.
We also illustrate how easily one can make joint inference about a subgroup of genes being differentially expressed and estimate the total number of significantly expressed genes. Further, we elaborate on how control of false positive rates can be incorporated automatically into this approach.
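As a rough illustration of these summaries, the sketch below assumes that posterior simulation has already produced draws of the latent indicators (1 = differentially expressed) and per-gene posterior probabilities. The abstract does not state which error-control rule is used; the rule shown, which keeps the posterior expected false discovery rate below a threshold, is one common choice and is an assumption here. All function names and inputs are hypothetical.

```python
def joint_prob_all_differential(indicator_draws, subgroup):
    """Fraction of posterior draws in which every gene in `subgroup` is flagged.

    indicator_draws: list of draws, each a list of 0/1 indicators, one per gene.
    subgroup: list of gene indices of interest.
    """
    hits = sum(1 for draw in indicator_draws if all(draw[g] == 1 for g in subgroup))
    return hits / len(indicator_draws)


def expected_num_differential(post_probs):
    """Posterior expected number of differentially expressed genes."""
    return sum(post_probs)


def select_with_fdr(post_probs, alpha):
    """Largest set of top-ranked genes whose posterior expected FDR is at most alpha.

    The expected FDR of a selected set is the mean of (1 - posterior probability)
    over that set. Genes are added in decreasing order of posterior probability,
    so the running mean never decreases and the loop can stop at the first breach.
    """
    order = sorted(range(len(post_probs)), key=lambda i: post_probs[i], reverse=True)
    selected = []
    error_sum = 0.0
    for i in order:
        new_error = error_sum + (1.0 - post_probs[i])
        if new_error / (len(selected) + 1) > alpha:
            break
        error_sum = new_error
        selected.append(i)
    return selected


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    draws = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 1, 0]]
    print(joint_prob_all_differential(draws, [0, 1]))
    probs = [0.99, 0.2, 0.95, 0.6, 0.9]
    print(expected_num_differential(probs))
    print(select_with_fdr(probs, alpha=0.1))
```

Because each quantity is computed from posterior draws or posterior probabilities, no separate multiple-testing correction is applied after the fact; the threshold `alpha` is the only tuning choice in this sketch.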