HMM Introduction (hmm介绍说明.ppt)
Hidden Markov Models (HMMs)
(Lecture for CS397-CXZ Algorithms in Bioinformatics)
Feb. 20, 2004
ChengXiang Zhai
Department of Computer Science
University of Illinois, Urbana-Champaign

Motivation: the CpG island problem
  Methylation in the human genome: “CG” → “TG” happens in most places except the “start regions” of genes.
  CpG islands = 100-1,000 bases before a gene starts.
  Questions
    Q1: Given a short stretch of genomic sequence, how would we decide whether it comes from a CpG island or not?
    Q2: Given a long sequence, how would we find the CpG islands in it?

Answer to Q1: Bayes Classifier
  Hypothesis space: H={HCpG, HOther}
  Evidence: X=“ATCGTTC”
  Likelihood of evidence (Generative Model)
  Prior probability
  We need two generative models for sequences: p(X|HCpG), p(X|HOther)

A Simple Model for Sequences: p(X)
  Probability rule
  Assume independence
  Capture some dependence

  P(x|HCpG):   P(A|HCpG)=0.25    P(T|HCpG)=0.25    P(C|HCpG)=0.25    P(G|HCpG)=0.25
  P(x|HOther): P(A|HOther)=0.25  P(T|HOther)=0.40  P(C|HOther)=0.10  P(G|HOther)=0.25

  X=ATTG vs. X=ATCG

Answer to Q2: Hidden Markov Model
  X=ATTGATGCAAAAGGGGGATCGGGCGATATAAAATTTG
  (regions labeled along X: Other, CpG Island, Other)
  How can we identify a CpG island in a long sequence?
  Idea 1: Test each window of a fixed number of nucleotides
  Idea 2: Classify the whole sequence
    Class label S1: OOOO………….……O
    Class label S2: OOOO…………. OCC
    …
    Class label Si: OOOO…OCC..CO…O
    …
    Class label SN: CCCC……………….CC
  S*=argmaxS P(S|X) = argmaxS P(S,X)
  S*=OOOO…OCC..CO…O (the C positions mark the CpG island)
  HMM is just one way of modeling p(X,S)…

A simple HMM
  Parameters
    Initial state prob: p(B)=0.5; p(I)=0.5
    State transition prob: p(B→B)=0.8  p(B→I)=0.2  p(I→B)=0.5  p(I→I)=0.5
    Output prob: P(a|B)=0.25, … p(c|B)=0.10 … P(c|I)=0.25 …
  State diagram: states B and I, with P(B)=0.5, P(I)=0.5, the transition probabilities above, and output distributions P(x|B), P(x|I)
  P(x|HCpG)=p(x|I):   P(a|I)=0.25  P(t|I)=0.25  P(c|I)=0.25  P(g|I)=0.25
  P(x|HOther)=p(x|B): P(a|B)=0.25  P(t|B)=0.40  P(c|B)=0.10  P(g|B)=0.25

How to “Generate” a Sequence?
  State diagram: B and I, transitions 0.8 / 0.2 / 0.5 / 0.5, outputs P(x|B) and P(x|I), P(B)=0.5 P(I
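
The two answers above can be tried out numerically with the parameters listed in the slides. The following Python sketch is only an illustration, not the lecture's own code. It assumes equal priors for the Bayes classifier in Q1 (the slides do not give a prior), uses the independence model for p(X|H), and computes S*=argmaxS P(S,X) for Q2 by enumerating every labeling, as in "Idea 2". The function names (`iid_likelihood`, `classify`, `joint`, `best_labeling`) are hypothetical, and bases are written in upper case throughout.

```python
from itertools import product
from math import prod

# Output probabilities from the slides: I = CpG island, B = background ("Other").
EMIT = {
    "I": {"A": 0.25, "T": 0.25, "C": 0.25, "G": 0.25},
    "B": {"A": 0.25, "T": 0.40, "C": 0.10, "G": 0.25},
}
INIT = {"B": 0.5, "I": 0.5}                   # initial state probabilities
TRANS = {"B": {"B": 0.8, "I": 0.2},           # p(B->B), p(B->I)
         "I": {"B": 0.5, "I": 0.5}}           # p(I->B), p(I->I)


def iid_likelihood(x, state):
    """p(X|H) under the independence assumption: product of per-base probabilities."""
    return prod(EMIT[state][base] for base in x)


def classify(x, prior_cpg=0.5):
    """Q1: Bayes classifier. The equal prior is an assumption, not from the slides."""
    score_cpg = prior_cpg * iid_likelihood(x, "I")
    score_other = (1.0 - prior_cpg) * iid_likelihood(x, "B")
    return "CpG" if score_cpg > score_other else "Other"


def joint(states, x):
    """p(S, X) for one labeling S (a string over 'B'/'I') of the sequence X."""
    p = INIT[states[0]] * EMIT[states[0]][x[0]]
    for prev, cur, base in zip(states, states[1:], x[1:]):
        p *= TRANS[prev][cur] * EMIT[cur][base]
    return p


def best_labeling(x):
    """Q2: S* = argmax_S p(S, X), found by trying all 2**len(x) labelings."""
    candidates = ("".join(s) for s in product("BI", repeat=len(x)))
    return max(candidates, key=lambda s: joint(s, x))


print(classify("ATTG"), classify("ATCG"))     # compare the two example sequences
print(best_labeling("ATCG"))
```

With these numbers, p(ATTG|HOther)=0.25·0.40·0.40·0.25=0.01 exceeds p(ATTG|HCpG)=0.25^4≈0.0039, while p(ATCG|HOther)=0.0025 falls below it, so the single C flips the decision. The exhaustive search is exponential in the sequence length and is only usable for very short X; long sequences need a dynamic-programming decoder such as the Viterbi algorithm.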