RGAAT: A Reference-based Genome Assembly and Annotation Tool for New Genomes and Upgrade of Known Genomes

The rapid development of high-throughput sequencing technologies has led to a dramatic decrease in the money and time required for de novo genome sequencing or genome resequencing projects, with new genome sequences constantly released every week. Among such projects, the plethora of updated genome...

Bibliographic Details
Main authors: Liu, Wanfei, Wu, Shuangyang, Lin, Qiang, Gao, Shenghan, Ding, Feng, Zhang, Xiaowei, Aljohi, Hasan Awad, Yu, Jun, Hu, Songnian
Format: Online Article Text
Language: English
Published: Elsevier 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6364042/
https://www.ncbi.nlm.nih.gov/pubmed/30583062
http://dx.doi.org/10.1016/j.gpb.2018.03.006
author Liu, Wanfei
Wu, Shuangyang
Lin, Qiang
Gao, Shenghan
Ding, Feng
Zhang, Xiaowei
Aljohi, Hasan Awad
Yu, Jun
Hu, Songnian
collection PubMed
description The rapid development of high-throughput sequencing technologies has led to a dramatic decrease in the money and time required for de novo genome sequencing or genome resequencing projects, with new genome sequences constantly released every week. Among such projects, the plethora of updated genome assemblies creates a need for version-dependent annotation files and other compatible public datasets for downstream analysis. To handle these tasks in an efficient manner, we developed the reference-based genome assembly and annotation tool (RGAAT), a flexible toolkit for resequencing-based consensus building and annotation update. RGAAT can detect sequence variants with comparable precision, specificity, and sensitivity to GATK and with higher precision and specificity than Freebayes and SAMtools on four DNA-seq datasets tested in this study. RGAAT can also identify sequence variants based on cross-cultivar or cross-version genomic alignments. Unlike GATK and SAMtools/BCFtools, RGAAT builds the consensus sequence by taking into account the true allele frequency. Finally, RGAAT generates a coordinate conversion file between the reference and query genomes using sequence variants and supports annotation file transfer. Compared to the rapid annotation transfer tool (RATT), RGAAT displays better performance characteristics for annotation transfer between different genome assemblies, strains, and species. In addition, RGAAT can be used for genome modification, genome comparison, and coordinate conversion. RGAAT is available at https://sourceforge.net/projects/rgaat/ and https://github.com/wushyer/RGAAT_v2 at no cost.
format Online
Article
Text
id pubmed-6364042
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6364042 2019-02-15 Genomics Proteomics Bioinformatics Application Note
Elsevier 2018-10 2018-12-21 /pmc/articles/PMC6364042/ /pubmed/30583062 http://dx.doi.org/10.1016/j.gpb.2018.03.006 Text en © 2018 The Authors http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title RGAAT: A Reference-based Genome Assembly and Annotation Tool for New Genomes and Upgrade of Known Genomes
topic Application Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6364042/
https://www.ncbi.nlm.nih.gov/pubmed/30583062
http://dx.doi.org/10.1016/j.gpb.2018.03.006