Basic Data Processing for Second-Generation Sequencing

  • Published 2019-02-18, Jiangxi
杜政霖 duzhl@

1. Basic data processing
  • Resequencing
    – Reads mapping
    – SNPs, indels, structural variations
    – Exon capture: NimbleGen, SureSelect
  • De novo sequencing
    – Reads assembly
    – Genome/transcriptome

2. Second-generation sequencing platform data
  • Illumina HiSeq 2000 (Solexa)
    – Read length: 50-100 bp
    – Format: fastq
  • AB SOLiD
    – Read length: 50 bp
    – Format: csfasta
  • Roche GS FLX (454)
    – Read length: ~400 bp
    – Format: sff/fasta

2.1 Solexa: fastq format
[Figure: annotated fastq record. Labels: Flowcell ID, Lane#, Tile#, X, Y, Reads 1/2, Reads Sequence, Reads Quality]

2.1 Solexa: fastq format (continued)
The standard Sanger variant assesses the reliability of a base call with the Phred quality score, Q = -10 log10(p), where p is the probability that the base call is wrong.
The Solexa pipeline (i.e., the software delivered with the Illumina Genome Analyzer) earlier used a different mapping, encoding the odds ratio p/(1-p) instead of the probability p: Q = -10 log10(p/(1-p)).
Quality characters are decoded as Qv = ASCII(char) - 64 (Solexa/Illumina) or Qv = ASCII(char) - 33 (Sanger).

2.2 SOLiD: csfasta format
[Figure: annotated csfasta record. Labels: Reads Sequence, Reads Quality]

2.3 fasta format
[Figure: annotated fasta record. Labels: Sequence ID, Sequence]

3. Reads alignment
[Figure: reads aligned to a reference. Labels: reference, reads, consensus, SNP, indel, structural variations]
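
The fastq read ID and quality encoding from section 2.1 can be sketched in Python as follows. This is a minimal illustration, not the pipeline's own code: the function names are hypothetical, the read ID is assumed to be colon-separated in the order Flowcell ID, Lane, Tile, X, Y with an optional "#index" and a "/1" or "/2" suffix, and the example read is made up.

```python
def parse_read_id(header: str) -> dict:
    """Split an assumed '@FLOWCELL:LANE:TILE:X:Y[#INDEX]/MATE' read ID."""
    name, _, mate = header.lstrip("@").partition("/")
    name = name.split("#")[0]  # drop an optional index tag such as '#0'
    flowcell, lane, tile, x, y = name.split(":")
    return {
        "flowcell": flowcell,
        "lane": int(lane),
        "tile": int(tile),
        "x": int(x),
        "y": int(y),
        "mate": int(mate) if mate else None,
    }


def decode_quality(qual: str, offset: int = 33) -> list:
    """Qv = ASCII(char) - offset; offset 33 for Sanger, 64 for Solexa/Illumina."""
    return [ord(c) - offset for c in qual]


def phred_to_error_prob(q: float) -> float:
    """Invert Q = -10 * log10(p)."""
    return 10 ** (-q / 10)


def solexa_to_error_prob(q: float) -> float:
    """Invert Q = -10 * log10(p / (1 - p)), the older odds-ratio mapping."""
    odds = 10 ** (-q / 10)
    return odds / (1 + odds)


if __name__ == "__main__":
    # Hypothetical four-line fastq record, for illustration only.
    record = ["@FC123:3:45:1024:2048#0/1", "ACGT", "+", "II5I"]
    print(parse_read_id(record[0]))
    scores = decode_quality(record[3], offset=33)
    print(scores, [phred_to_error_prob(q) for q in scores])
```

Passing the wrong offset shifts every score by 31, so it is worth confirming which encoding a file uses before decoding. The two mappings give nearly the same error probability at high quality and differ mainly at low scores.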
