BG7: A New Approach for Bacterial Genome Annotation Designed for Next Generation Sequencing Data

Pablo Pareja-Tobes, Marina Manrique, Eduardo Pareja-Tobes, Eduardo Pareja, Raquel Tobes*

Oh no sequences! Research group, Era7 Bioinformatics, Granada, Spain

Abstract

BG7 is a new system for de novo bacterial, archaeal and viral genome annotation, based on an approach specifically designed for annotating genomes sequenced with next generation sequencing (NGS) technologies. The system is versatile and can annotate genes even at the preliminary assembly stage of a genome. It is especially efficient at detecting unexpected genes acquired horizontally from distant bacterial or archaeal genomes, phages, plasmids, and mobile elements.

From the initial phases of the gene annotation process, BG7 exploits the massive availability of annotated protein sequences in databases. BG7 predicts ORFs and infers their function from protein similarity to a wide set of reference proteins, integrating ORF prediction and functional annotation in a single step. BG7 is especially tolerant of sequencing errors in start and stop codons, of frameshifts, and of assembly or scaffolding errors. The system is also tolerant of the high level of gene fragmentation frequently found in genomes that are not fully assembled.

The current version of BG7, which is developed in Java, takes advantage of Amazon Web Services (AWS) cloud computing features, but it can also be run locally on any operating system. BG7 is a fast, automated and scalable system that can cope with the challenge of analyzing the huge number of genomes being sequenced with NGS technologies. Its capabilities
