Introduction
SOAPdenovo is a novel short-read assembly method that can build a de novo draft assembly for human-sized genomes. The program is specially designed to assemble Illumina GA short reads. It creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
System Requirement
SOAPdenovo is aimed at large plant and animal genomes, although it also works well on bacterial and fungal genomes. It runs on 64-bit Linux systems with a minimum of 5 GB of physical memory. For large genomes such as human, about 150 GB of memory is required.
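The requirements above can be checked before launching a run. The following is a minimal sketch, not part of SOAPdenovo itself: the function names, the 64-bit heuristic (machine name ending in "64"), and the reading of "5 GB" as 5 × 1024³ bytes are all assumptions made for illustration.

```python
import os
import platform

MIN_MEM_GB = 5             # minimum physical memory stated in the README
LARGE_GENOME_MEM_GB = 150  # approximate need for a human-sized genome


def meets_requirements(system: str, is_64bit: bool, mem_bytes: int,
                       min_gb: float = MIN_MEM_GB) -> bool:
    """Return True if the host is 64-bit Linux with at least min_gb of RAM.

    Assumption: 1 GB is taken as 1024**3 bytes.
    """
    return system == "Linux" and is_64bit and mem_bytes >= min_gb * 1024**3


def physical_memory_bytes() -> int:
    """Total physical memory as reported by the OS (Linux/POSIX only)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


if __name__ == "__main__":
    system = platform.system()
    # Heuristic (assumption): treat machine names ending in "64" as 64-bit.
    is_64bit = platform.machine().endswith("64")
    mem = physical_memory_bytes() if system == "Linux" else 0
    print("minimum requirements met:",
          meets_requirements(system, is_64bit, mem))
    print("enough for a human-sized genome:",
          meets_requirements(system, is_64bit, mem, LARGE_GENOME_MEM_GB))
```

Keeping the decision logic in `meets_requirements` separate from the code that queries the host makes the thresholds easy to adjust for other genome sizes.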
Download
Release 1.0
Only a precompiled binary version is available at present.
Installation
Download the SOAPdenovo tar package;
Unpack it;
The package contains one executable file “soapdenovo”, one demo configuration file “example.config”, and this readme file.
How to use it
Configuration file
For big genome projects with deep sequencing, the data are usually organized as multiple read sequence files generated from multiple libraries. The configuration file tells the assembler where to find these files and the relevant information. “example.config” is an example of such a file.
The configuration file has a section for global information, followed by multiple library sections. At present, only “max_rd_len” is included in the global information section. Any read longer than max_rd_len will be cut to this length.
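The effect of max_rd_len can be illustrated with a short sketch. This is not SOAPdenovo's own code: the function name `truncate_read` is hypothetical, and the sketch assumes that the leading bases are kept and the tail of the read is discarded, which the text above does not state explicitly.

```python
def truncate_read(seq: str, max_rd_len: int) -> str:
    """Cut a read to at most max_rd_len bases.

    Assumption: the first max_rd_len bases are kept; reads that are
    already short enough are returned unchanged.
    """
    if max_rd_len <= 0:
        raise ValueError("max_rd_len must be positive")
    return seq[:max_rd_len]


if __name__ == "__main__":
    reads = ["ACGTACGTAC", "ACGT"]
    # With max_rd_len = 6, only the first read is long enough to be cut.
    print([truncate_read(r, 6) for r in reads])  # ['ACGTAC', 'ACGT']
```

In practice this means max_rd_len should be set no lower than the longest read length that is meant to be used in full.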
The library information and the information of sequencing data generated from the library should be organized in the cor