:::

[2017-09-26] Prof. Ting Chen, Tsinghua University, Beijing, "Large-Scale Metagenomic Sequence Clustering and Inference of Environment-Microbe and Microbe-Microbe Associations"

Seminar Talk Announcement
Posted by: Seminar account 2
Announcement date: 2017-09-22

Title: Large-Scale Metagenomic Sequence Clustering and Inference of Environment-Microbe and Microbe-Microbe Associations
Date: 2017-09-26 10:30 am-11:30 am
Location: R107, CSIE
Speaker: Prof. Ting Chen, Tsinghua University, Beijing
Hosted by: Prof. Kun-Mao Chao

 

Abstract

For an environmental sample (e.g., from marine, freshwater, soil, or human-body habitats), the diversity of the microbial community can be characterized by the identities of its taxonomic units and their abundance levels. With advances in next-generation sequencing technology, it is now possible to directly sequence DNA obtained from environmental samples. In this talk, we focus on targeted 16S rRNA gene sequencing, which directly profiles the diversity of microbial communities. We present an unsupervised Bayesian clustering method for clustering 16S rRNA sequences for taxonomic prediction, and we then speed it up to cluster billions of sequences.

Understanding associations among microbes, and between microbes and their environmental factors, from metagenomic sequencing data is a key research topic in microbial ecology. It could help us unravel real interactions in a community (e.g., commensalism, parasitism, competition) and understand community-wide dynamics. Although several statistical tools have been developed for metagenomic association studies, they either suffer from compositional bias or fail to take into account environmental factors that directly affect the composition of a microbial community, leading to false-positive associations. Here, we propose the metagenomic Lognormal-Dirichlet-Multinomial (mLDM) model, a hierarchical Bayesian model with sparsity constraints that bypasses compositional bias and discovers new associations among microbes and between microbes and their environmental factors. The mLDM model is able to: 1) infer both conditionally independent associations among microbes and direct associations between microbes and environmental factors; 2) consider both compositional bias and variance of metagenomic data; and 3) estimate absolute abundance for microbes. Thus, conditionally independent associations can capture the direct relationships underlying pairs of microbes and remove the indirect connections induced by other common factors.



Biography

Dr. Ting Chen is currently Qianren Professor of computer science at Tsinghua University and director of the Center for Big Data Research in Medicine and Health, Institute of Data Science. He is also a faculty member at the Bioinformatics Division, Tsinghua National Laboratory of Information Science and Technology. He graduated from Tsinghua University in 1993 with a B.E. in computer science and received his Ph.D. in computer science from SUNY Stony Brook in 1997. He was a lecturer of genetics at Harvard University from 1997 to 2000, and then became an assistant professor, associate professor, and professor of biological sciences and computer science at the University of Southern California (USC).

His research interests are in the areas of computational biology/bioinformatics, medical informatics, algorithms, and statistical learning. His current research topics include (1) medical data analysis and intelligent medicine, (2) single-cell RNA sequencing data analysis, (3) human genotype and phenotype association, and (4) human microbial interactions, functions, and identifications. He received the Sloan Research Fellowship in 2004. He has published over 100 papers with more than 10,000 citations (Google Scholar), 19 of which have more than 100 citations.

Last modified: 2017-09-22 3:23 PM
