Motif identification neural design for rapid and sensitive protein family search
Cathy H. Wu, Hsi-Lien Chen, Chin-Ju Lo and Jerry W. McLarty
Department of Epidemiology/Biomathematics
The University of Texas Health Center at Tyler
Tyler, TX 75710
Abstract
The accelerated growth of molecular sequencing data has generated a pressing
need for advanced sequence annotation tools. This paper reports a new method,
termed MOTIFIND (Motif Identification Neural Design), for rapid and sensitive
protein family identification. The method is extended from our previous gene
classification artificial neural system and employs two new designs to enhance the
detection of distant relationships. These include an n-gram term weighting algorithm
for extracting local motif patterns, and integrated neural networks for combining
global and local sequence information. The system has been tested with three protein
families of electron transferases, namely cytochrome c, cytochrome b and
flavodoxin, with 100% sensitivity and more than 99.6% specificity. The accuracy
of MOTIFIND is comparable to that of the BLAST database search method, but MOTIFIND is
more than 20 times faster. The system is much more robust than the PROSITE
search, which is based on simple signature patterns. MOTIFIND also compares
favorably with the BLIMPS search of BLOCKS in detecting fragmentary sequences
lacking complete motif regions. The method has the potential to become a full-scale
database search and sequence analysis tool.
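
As a rough illustration of the n-gram idea mentioned above, the following Python
sketch counts overlapping n-grams in a protein sequence and turns them into a
fixed-length frequency vector of the kind a neural network could take as input.
It is a minimal sketch under stated assumptions, not the MOTIFIND implementation:
the function names, the 20-letter alphabet, the choice of n = 2 and the simple
frequency normalization are all hypothetical, and the term weighting algorithm
itself is not reproduced here because this section does not describe it.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids; the paper may use another alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ngram_counts(sequence, n=2):
    """Count overlapping n-grams (substrings of length n) in a sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_vector(sequence, n=2, alphabet=AMINO_ACIDS):
    """Map a sequence to a fixed-length vector of relative n-gram frequencies.

    The vector has len(alphabet) ** n entries, one per possible n-gram,
    regardless of sequence length. N-grams containing letters outside the
    alphabet are counted in the total but have no entry in the vector.
    """
    counts = ngram_counts(sequence, n)
    total = sum(counts.values()) or 1  # avoid division by zero for short input
    keys = ["".join(p) for p in product(alphabet, repeat=n)]
    return [counts.get(k, 0) / total for k in keys]


if __name__ == "__main__":
    # Hypothetical example sequence, not taken from the paper.
    vec = ngram_vector("MKVLAAGIVGLK")
    print(len(vec), sum(1 for v in vec if v > 0))
```

Because the vector length depends only on the alphabet and n, sequences of any
length map to the same input size, which is what makes this kind of encoding
convenient for a fixed-architecture network.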
Introduction
As technology improves and molecular sequencing data accumulate nearly
exponentially, progress in the Human Genome Project will depend increasingly on
the development of advanced computational tools for rapid and accurate annotation of
genomic sequences. Currently, a database search for sequence similarities is the
most direct computational means of deciphering codes that connect molecular
sequences with protein structure and function [Doolittle, 1990]. There are good
algorithm