Multi-CAR: a tool of contig scaffolding using multiple references

Bibliographic Details
Main Authors: Chen, Kun-Tze; Chen, Cheih-Jung; Shen, Hsin-Ting; Liu, Chia-Liang; Huang, Shang-Hao; Lu, Chin Lung
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260120/
https://www.ncbi.nlm.nih.gov/pubmed/28155633
http://dx.doi.org/10.1186/s12859-016-1328-7
collection PubMed
description BACKGROUND: A draft genome assembled by current next-generation sequencing techniques from short reads is just a collection of contigs, whose relative positions and orientations along the genome being sequenced are unknown. To further obtain its complete sequence, a contig scaffolding process is usually applied to order and orient the contigs in the draft genome. Although several single reference-based scaffolding tools have been proposed, they may produce erroneous scaffolds if there are rearrangements between the target and reference genomes or their phylogenetic relationship is distant. This may suggest that a single reference genome may not be sufficient to produce correct scaffolds of a draft genome.
RESULTS: In this study, we design a simple heuristic method to further revise our single reference-based scaffolding tool CAR into a new one called Multi-CAR such that it can utilize multiple complete genomes of related organisms as references to more accurately order and orient the contigs of a draft genome. In practical usage, our Multi-CAR does not require prior knowledge concerning phylogenetic relationships among the draft and reference genomes and libraries of paired-end reads. To validate Multi-CAR, we have tested it on a real dataset composed of several prokaryotic genomes and also compared its accuracy performance with other multiple reference-based scaffolding tools Ragout and MeDuSa. Our experimental results have finally shown that Multi-CAR indeed outperforms Ragout and MeDuSa in terms of sensitivity, precision, genome coverage, scaffold number and scaffold N50 size.
CONCLUSIONS: Multi-CAR serves as an efficient tool that can more accurately order and orient the contigs of a draft genome based on multiple reference genomes. The web server of Multi-CAR is freely available at http://genome.cs.nthu.edu.tw/Multi-CAR/.
id pubmed-5260120
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5260120 2017-01-30 BMC Bioinformatics Research BioMed Central 2016-12-23 /pmc/articles/PMC5260120/ /pubmed/28155633 http://dx.doi.org/10.1186/s12859-016-1328-7 Text en
© The Author(s) 2016. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
topic Research