Pan4Draft: A Computational Tool to Improve the Accuracy of Pan-Genomic Analysis Using Draft Genomes

High-throughput sequencing technologies are a milestone in molecular biology for facilitating great advances in genomics by enabling the deposit of large volumes of biological data to public databases. The availability of such data has made comparative genomic analysis possible through pipelines...

Bibliographic Details
Main Authors: Veras, Allan, Araujo, Fabricio, Pinheiro, Kenny, Guimarães, Luis, Azevedo, Vasco, Soares, Siomar, da Costa da Silva, Artur, Ramos, Rommel
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6018222/
https://www.ncbi.nlm.nih.gov/pubmed/29942087
http://dx.doi.org/10.1038/s41598-018-27800-8
author Veras, Allan
Araujo, Fabricio
Pinheiro, Kenny
Guimarães, Luis
Azevedo, Vasco
Soares, Siomar
da Costa da Silva, Artur
Ramos, Rommel
collection PubMed
description High-throughput sequencing technologies are a milestone in molecular biology for facilitating great advances in genomics by enabling the deposit of large volumes of biological data to public databases. The availability of such data has made comparative genomic analysis possible through pipelines that use the entire gene repertoire of genomes. However, a large number of unfinished genomes exist in public databases; their number is approximately 16-fold higher than the number of complete genomes, which creates bias during comparative analyses. Therefore, the present work proposes a new tool called Pan4Drafts, an automated pipeline for pan-genomic analysis of draft prokaryotic genomes to maximize the representation and accuracy of the gene repertoire of unfinished genomes by using reads from sequencing data. Pan4Draft allows users to perform comparative analyses using different methodologies, such as combining complete and draft genomes, using only draft genomes, or using only complete genomes. Pan4Draft is available at http://www.computationalbiology.ufpa.br/pan4drafts and the test dataset is available at https://sourceforge.net/projects/pan4drafts.
format Online
Article
Text
id pubmed-6018222
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-6018222 2018-07-06 Pan4Draft: A Computational Tool to Improve the Accuracy of Pan-Genomic Analysis Using Draft Genomes Veras, Allan Araujo, Fabricio Pinheiro, Kenny Guimarães, Luis Azevedo, Vasco Soares, Siomar da Costa da Silva, Artur Ramos, Rommel Sci Rep Article High-throughput sequencing technologies are a milestone in molecular biology for facilitating great advances in genomics by enabling the deposit of large volumes of biological data to public databases. The availability of such data has made comparative genomic analysis possible through pipelines that use the entire gene repertoire of genomes. However, a large number of unfinished genomes exist in public databases; their number is approximately 16-fold higher than the number of complete genomes, which creates bias during comparative analyses. Therefore, the present work proposes a new tool called Pan4Drafts, an automated pipeline for pan-genomic analysis of draft prokaryotic genomes to maximize the representation and accuracy of the gene repertoire of unfinished genomes by using reads from sequencing data. Pan4Draft allows users to perform comparative analyses using different methodologies, such as combining complete and draft genomes, using only draft genomes, or using only complete genomes. Pan4Draft is available at http://www.computationalbiology.ufpa.br/pan4drafts and the test dataset is available at https://sourceforge.net/projects/pan4drafts. Nature Publishing Group UK 2018-06-25 /pmc/articles/PMC6018222/ /pubmed/29942087 http://dx.doi.org/10.1038/s41598-018-27800-8 Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Pan4Draft: A Computational Tool to Improve the Accuracy of Pan-Genomic Analysis Using Draft Genomes
topic Article