Bayesian Inference for Genomic Data Integration Reduces Misclassification Rate in Predicting Protein-Protein Interactions

  • About 90,700 characters
  • About 10 pages
  • Posted 2017-08-31 from Shanghai

Bayesian Inference for Genomic Data Integration Reduces Misclassification Rate in Predicting Protein-Protein Interactions

Chuanhua Xing 1*, David B. Dunson 2

1 Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, United States of America
2 Department of Statistical Science, Duke University, Durham, North Carolina, United States of America

Abstract

Protein-protein interactions (PPIs) are essential to most fundamental cellular processes. There has been increasing interest in reconstructing PPI networks. However, several critical difficulties exist in obtaining reliable predictions. Notably, false positive rates can exceed 80%. Correcting errors at each generating source can be both time-consuming and inefficient, because a single test can rarely cover the errors introduced at multiple levels of data processing. We propose a novel Bayesian integration method, termed nonparametric Bayes ensemble learning (NBEL), to lower the misclassification rate (both false positives and false negatives) by automatically up-weighting the most informative data sources while down-weighting less informative and biased sources. Extensive studies indicate that NBEL is significantly more robust than the classic naïve Bayes to unreliable, error-prone and contaminated data. On a large human data set, our NBEL approach predicts many more PPIs than naïve Bayes. This suggests that previous studies may contain large numbers of not only false positives but also false negatives. Validation on two high-quality human PPI datasets supports our observations. Our experiments demonstrate that it is feasible to predict high-throughput
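
For context, the classic naïve Bayes integration used as the baseline above is commonly written as prior odds multiplied by one likelihood ratio per data source, under the assumption that sources are conditionally independent. The following is a minimal sketch of that baseline only. The function name, parameter names, and the optional `weights` argument are assumptions introduced for illustration; the weights merely show what down-weighting a source does to the posterior and are not the NBEL method, whose details are not given in the text above.

```python
import math


def posterior_probability(prior, likelihood_ratios, weights=None):
    """Posterior probability that a protein pair interacts.

    prior: prior probability of interaction, strictly between 0 and 1.
    likelihood_ratios: one value per data source, each equal to
        P(evidence | interacting) / P(evidence | not interacting).
    weights: hypothetical per-source exponents (illustration only).
        Leaving them at 1.0 gives the plain naive Bayes combination.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(likelihood_ratios)
    if len(weights) != len(likelihood_ratios):
        raise ValueError("need exactly one weight per likelihood ratio")

    # Work in log-odds so that many sources do not overflow or underflow.
    log_odds = math.log(prior / (1.0 - prior))
    for ratio, weight in zip(likelihood_ratios, weights):
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += weight * math.log(ratio)

    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))
```

Calling `posterior_probability(0.5, [2.0, 3.0])` combines both sources at full strength, while `posterior_probability(0.5, [2.0, 3.0], weights=[1.0, 0.0])` ignores the second source entirely, which is the extreme case of down-weighting an unreliable source.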
