Identification of Regulatory Modules by
Co-clustering Latent Variable Models: Stem Cell Differentiation

Authors: Je-Gun Joung1, Dongho Shin3, Rho Hyun Seong3, and Byoung-Tak Zhang1,2

1Center for Bioinformation Technology, 2School of Computer Science and Engineering, and 3Research Center for Functional Cellulomics, Institute of Molecular Biology and Genetics and Department of Biological Sciences, Seoul National University, Seoul 151-742, Republic of Korea

Abstract

Motivation: An important issue in stem cell biology is understanding how to direct differentiation towards a specific cell type. To elucidate the mechanism, previous studies have focused on identifying the responsible gene regulators, but they have failed to provide a systemic view of regulatory modules. To obtain a unified description of the regulatory modules, we characterized major stem cell species by employing a co-clustering latent variable model (LVM). The LVM-based method allowed us to identify cell type-specific transcription factors using genomic sequences as well as expression profiles.

Results: We used a list of genes enriched in each of 21 stem cell subpopulations, together with their upstream genomic sequences. The LVM-based study allowed us to uncover the regulatory modules for each stem cell cluster, e.g. GABP and E2F for the proliferation phase, and Ap2α and Ap2γ for the quiescence phase. Furthermore, the identities of the stem cell clusters were clearly revealed by the constituent genes that were directly targeted by the modules. Our analytical framework was thus shown to be useful through a detailed case study of stem cell differentiation, and it can be applied to problems with similar characteristics.

Collection of gene sets representing major stem cell populations (Excel)

Screening transcription factor binding motifs (pdf)