BNFO 136Sequence alignment.ppt

Published 2017-08-13, Gansu

BNFO 136: Sequence alignment
Usman Roshan

Pairwise alignment

X: ACA, Y: GACAT
Match = 8, mismatch = 2, gap = -5

Four candidate alignments and their scores:

    ACA--           -ACA-           --ACA           ACA----
    GACAT           GACAT           GACAT           G--ACAT
    2+2+2-5-5       -5+8+8+8-5      -5-5+2+2+2      2-5-5-5-5-5-5
    Score = -4      Score = 14      Score = -4      Score = -28

Traceback

We can compute an alignment of DNA (or protein or RNA) sequences X and Y with a traceback matrix T. Sequence X is aligned along the rows and Y along the columns. Each entry of T contains D, L, or U, specifying a diagonal, left, or upward move.

Traceback example: X: ACA, Y: TACAG

Traceback code

    # seq2 (X) indexes the rows of T; seq1 (Y) indexes the columns.
    # The loop assumes row 0 of T holds "L" and column 0 holds "U".
    aligned_seq1 = ""
    aligned_seq2 = ""
    i = len(seq2)
    j = len(seq1)
    while i != 0 or j != 0:
        if T[i][j] == "L":
            # Left move: consume seq1[j-1] and put a gap in seq2.
            aligned_seq2 = "-" + aligned_seq2
            aligned_seq1 = seq1[j-1] + aligned_seq1
            j = j - 1
        elif T[i][j] == "U":
            # Upward move: consume seq2[i-1] and put a gap in seq1.
            aligned_seq1 = "-" + aligned_seq1
            aligned_seq2 = seq2[i-1] + aligned_seq2
            i = i - 1
        else:
            # Diagonal move: align seq1[j-1] with seq2[i-1].
            aligned_seq1 = seq1[j-1] + aligned_seq1
            aligned_seq2 = seq2[i-1] + aligned_seq2
            i = i - 1
            j = j - 1

Optimal alignment

An alignment can be specified by the traceback matrix. How do we determine the traceback for the highest-scoring alignment?

Needleman-Wunsch algorithm for global alignment

    First proposed in 1970
    Widely used in genomics/bioinformatics
    Dynamic programming algorithm

Needleman-Wunsch (NW)

Input: X = x1x2…xn, Y = y1y2…ym (X is seq2 and Y is seq1)
Notation: X1..i = x1x2…xi
Score(X1..i, Y1..j) = optimal alignment score of sequences X1..i and Y1..j.

Suppose we know the optimal alignment scores of

    X1..i-1 and Y1..j-1
    X1..i and Y1..j-1
    X1..i-1 and Y1..j

Then the optimal alignment score of X1..i and Y1..j is the maximum of

    Score(X1..i-1, Y1..j-1) + match/mismatch
    Score(X1..i, Y1..j-1) + gap
    Score(X1..i-1, Y1..j) + gap

We build on this observation to compute Score(X1..n, Y1..m).

Define V to be a two-dimensional matrix with len(X)+1 rows and len(Y)+1 columns. Let V[i][j] be the score of the optimal alignment of X1..i and Y1..j. Let m be the match cost,