non-negative matrix factorization for learning alignment-specific models of protein evolution.pdf

  • About 78,300 characters
  • About 11 pages
  • Published 2017-09-01, Shanghai


Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution

Ben Murrell (1,2), Thomas Weighill (2), Jan Buys (2), Robert Ketteringham (2), Sasha Moola (3), Gerdus Benade (2), Lise du Buisson (2), Daniel Kaliski (3), Tristan Hands (3), Konrad Scheffler (2)*

1 Biomedical Informatics Research Division, eHealth Research and Innovation Platform, Medical Research Council, Cape Town, Western Cape, South Africa
2 Stellenbosch University, Stellenbosch, Western Cape, South Africa
3 University of Cape Town, Cape Town, Western Cape, South Africa

Abstract

Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data are available for a single organism or gene, and are intended for use on that organism or gene only. Unsurprisingly, specialist models outperform generalist models, but in most instances there simply are not enough data available to estimate them. We propose a method for estimating alignment-specific models of protein evolution in which the complexity of the model is adapted to suit the richness of the data. Our method uses non-negative matrix factorization (NNMF) to learn a set of basis matrices from a general dataset containing a large number of alignments of different proteins, thus capturing the dimensions of important variation. It then learns a set of weights that are specific to the organism or gene of in
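The two-stage idea described in the abstract (learn shared basis matrices from many alignments, then fit a small set of non-negative weights for one alignment) can be illustrated with a generic NNMF sketch. This is not the authors' estimation procedure, which this excerpt does not show. It assumes plain Frobenius-norm multiplicative updates (Lee and Seung style) applied to toy random data, where each row stands in for a flattened 20 x 20 amino-acid matrix. The function names `nnmf` and `fit_weights` are hypothetical.

```python
import numpy as np

def nnmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Factor non-negative V (n_samples x n_features) as W @ H.

    Uses multiplicative updates for the squared Frobenius error,
    an assumed stand-in for whatever objective the paper optimizes.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps  # per-sample weights
    H = rng.random((rank, m)) + eps  # shared basis, one row per basis matrix
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def fit_weights(v, H, n_iter=500, eps=1e-9):
    """Keep the basis H fixed and learn non-negative weights w with w @ H ~ v."""
    w = np.full(H.shape[0], 1.0 / H.shape[0])
    HHt = H @ H.T
    for _ in range(n_iter):
        w *= (H @ v) / (HHt @ w + eps)
    return w

# Toy data (an assumption): 50 "alignments", each a flattened 20 x 20 matrix.
rng = np.random.default_rng(1)
V = rng.random((50, 3)) @ rng.random((3, 400))

W, H = nnmf(V, rank=3)          # stage 1: shared basis from the general dataset
w_new = fit_weights(V[0], H)    # stage 2: weights for one target alignment
print(H.shape, w_new.shape)
```

Only `rank` weights are estimated per alignment in the second stage, which is why such a scheme can work when a single alignment is too small to support a full specialist model.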
