Biclustering Nonlinearly Correlated Time Series Gene Expression Data

  • Posted 2017-08-20 from Beijing


计算机研究与发展 (Journal of Computer Research and Development), ISSN 1000-1239/CN 11-1777/TP, 45(11): 1865-1873, 2008

Biclustering Nonlinearly Correlated Time Series Gene Expression Data

Yan Leiming, Sun Zhihui, Wu Yingjie, and Zhang Baili
(School of Computer Science & Engineering, Southeast University, Nanjing 210096)
(yan_leiming@163.com)

Abstract  Biclustering algorithms cluster correlated patterns in subspaces. However, most current biclustering algorithms address only linearly correlated patterns, or a certain kind of linearly similar pattern, and leave untouched the nonlinearly correlated patterns that are often hidden in many real data sets. In this paper, a novel biclustering algorithm called MI-TSB is proposed to find and report all nonlinearly correlated patterns in time series gene expression data. An efficient formula for computing quadratic mutual information is first derived with matrix theory. Then, based on quadratic mutual information and a sliding window technique, a nonlinear similarity model for time series data and a simple variant of the generalized suffix tree are introduced. Using the suffix tree as an index structure, the MI-TSB algorithm explores all biclusters effectively and efficiently. Compared with general biclustering algorithms, the ability to discover nonlinearly correlated patterns within a sliding window is one of the most important advantages of MI-TSB. Additionally, experiments on a real gene expression data set and a synthetic data set show that MI-TSB successfully discovers some nonlinearly correlated patterns that cannot be found by other ordinary biclustering algorithms. Besides, gene annotation with Gene Ontology demonstrates that MI-TSB can find biologically meaningful results.

Keywords  quadratic mutual information; nonlinear correlation; biclustering; bioinformatics; gene expression data

Abstract (Chinese original)  To cluster nonlinearly…
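The abstract names quadratic mutual information over a sliding window as the similarity measure but does not state the formula. As a rough illustration only, the sketch below assumes the Euclidean-distance form of quadratic mutual information estimated with Gaussian Parzen windows; this may differ from the matrix-based formula the paper derives. The names `ed_qmi`, `sliding_qmi`, and the kernel width `sigma` are hypothetical and chosen for this example.

```python
import math


def gaussian(d, var):
    """1-D Gaussian density with variance `var`, evaluated at distance d."""
    return math.exp(-d * d / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def ed_qmi(xs, ys, sigma=0.2):
    """Euclidean-distance quadratic mutual information of paired samples.

    Assumption: Parzen-window density estimates with Gaussian kernels of
    width `sigma`; this is an illustrative estimator, not the paper's code.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    # Integrating the product of two Parzen kernels doubles the variance.
    var = 2.0 * sigma * sigma
    kx = [[gaussian(xs[i] - xs[j], var) for j in range(n)] for i in range(n)]
    ky = [[gaussian(ys[i] - ys[j], var) for j in range(n)] for i in range(n)]
    row_x = [sum(row) / n for row in kx]
    row_y = [sum(row) / n for row in ky]
    # Integral of the squared joint density estimate.
    v_joint = sum(kx[i][j] * ky[i][j] for i in range(n) for j in range(n)) / (n * n)
    # Integral of the squared product of the marginal estimates.
    v_marg = (sum(row_x) / n) * (sum(row_y) / n)
    # Integral of joint estimate times product of marginal estimates.
    v_cross = sum(rx * ry for rx, ry in zip(row_x, row_y)) / n
    return v_joint - 2.0 * v_cross + v_marg


def sliding_qmi(a, b, width, sigma=0.2):
    """QMI of two series inside every window of `width` consecutive points."""
    return [
        ed_qmi(a[s:s + width], b[s:s + width], sigma)
        for s in range(len(a) - width + 1)
    ]


# Example: y = x^2 on a symmetric grid, a purely nonlinear dependence.
grid = [i / 20.0 - 1.0 for i in range(41)]
parabola = [x * x for x in grid]
qmi_value = ed_qmi(grid, parabola)
window_scores = sliding_qmi(grid, parabola, width=10)
```

On this symmetric grid the sample covariance of `grid` and `parabola` is zero, so a linear correlation measure would report no relationship, while the quadratic mutual information estimate is positive. The estimator costs O(n²) per window, which is acceptable for short expression profiles but would need care on long series.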
