
Journal of Integration Technology (集成技术), Vol. 1, No. 3, Sep. 2012

Status and Challenges on Data Analysis of High Throughput Sequencing

ZHANG Wen-li 1,2
1 (Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)
2 (State Key Laboratory of Computer Architecture, Beijing 100190, China)

Abstract: Genes are the material basis of heredity. All life phenomena, including birth, growth, disease, aging, and death, are related to genes. Gene sequencing is one way to read life. With the development of next-generation high-throughput sequencing technology, terabytes or more of sequence data are generated every day. Interpreting these large, complex, high-dimensional data sensibly is harder than acquiring them; it is a critical step in current biological research and has great practical significance. Storing, processing, and analyzing massive high-throughput sequencing data severely challenges current computer systems and computing models. Drawing on a survey, in particular a case study of BGI (Beijing Genomics Institute), this paper discusses the current status of high-throughput sequencing data analysis, the problems it faces, and the measures taken by the various parties involved. Meeting the challenges posed by high-throughput sequencing data, however, still requires close cooperation among many parties and long-term, in-depth research.

Keywords: genome; high-throughput sequencing; data analysis; cloud computing; workflow
