
Assembling draft genomes using contiBAIT

SUMMARY: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or...


Bibliographic Details
Main Authors: O’Neill, Kieran; Hills, Mark; Gottlieb, Mike; Borkowski, Matthew; Karsan, Aly; Lansdorp, Peter M
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860061/
https://www.ncbi.nlm.nih.gov/pubmed/28475666
http://dx.doi.org/10.1093/bioinformatics/btx281
_version_ 1783307939678781440
author O’Neill, Kieran
Hills, Mark
Gottlieb, Mike
Borkowski, Matthew
Karsan, Aly
Lansdorp, Peter M
author_facet O’Neill, Kieran
Hills, Mark
Gottlieb, Mike
Borkowski, Matthew
Karsan, Aly
Lansdorp, Peter M
author_sort O’Neill, Kieran
collection PubMed
description SUMMARY: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or as chromosome builds (consisting of correctly ordered, but not necessarily correctly oriented scaffolds separated by gaps). Completion of de novo assemblies remains difficult, as regions that are repetitive or hard to sequence prevent the accumulation of larger scaffolds, and create errors such as misorientations and mislocalizations. Thus, complementary methods for determining the orientation and positioning of fragments are important for finishing assemblies. Strand-seq is a method for determining template strand inheritance in single cells, information that can be used to determine relative genomic distance and orientation between scaffolds, and find errors within them. We present contiBAIT, an R/Bioconductor package which uses Strand-seq data to repair and improve existing assemblies. AVAILABILITY AND IMPLEMENTATION: contiBAIT is available on Bioconductor. Source files available from GitHub. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-5860061
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-58600612018-03-23 Assembling draft genomes using contiBAIT O’Neill, Kieran Hills, Mark Gottlieb, Mike Borkowski, Matthew Karsan, Aly Lansdorp, Peter M Bioinformatics Applications Notes SUMMARY: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or as chromosome builds (consisting of correctly ordered, but not necessarily correctly oriented scaffolds separated by gaps). Completion of de novo assemblies remains difficult, as regions that are repetitive or hard to sequence prevent the accumulation of larger scaffolds, and create errors such as misorientations and mislocalizations. Thus, complementary methods for determining the orientation and positioning of fragments are important for finishing assemblies. Strand-seq is a method for determining template strand inheritance in single cells, information that can be used to determine relative genomic distance and orientation between scaffolds, and find errors within them. We present contiBAIT, an R/Bioconductor package which uses Strand-seq data to repair and improve existing assemblies. AVAILABILITY AND IMPLEMENTATION: contiBAIT is available on Bioconductor. Source files available from GitHub. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2017-09-01 2017-05-05 /pmc/articles/PMC5860061/ /pubmed/28475666 http://dx.doi.org/10.1093/bioinformatics/btx281 Text en © The Author 2017. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Applications Notes
O’Neill, Kieran
Hills, Mark
Gottlieb, Mike
Borkowski, Matthew
Karsan, Aly
Lansdorp, Peter M
Assembling draft genomes using contiBAIT
title Assembling draft genomes using contiBAIT
title_full Assembling draft genomes using contiBAIT
title_fullStr Assembling draft genomes using contiBAIT
title_full_unstemmed Assembling draft genomes using contiBAIT
title_short Assembling draft genomes using contiBAIT
title_sort assembling draft genomes using contibait
topic Applications Notes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860061/
https://www.ncbi.nlm.nih.gov/pubmed/28475666
http://dx.doi.org/10.1093/bioinformatics/btx281
work_keys_str_mv AT oneillkieran assemblingdraftgenomesusingcontibait
AT hillsmark assemblingdraftgenomesusingcontibait
AT gottliebmike assemblingdraftgenomesusingcontibait
AT borkowskimatthew assemblingdraftgenomesusingcontibait
AT karsanaly assemblingdraftgenomesusingcontibait
AT lansdorppeterm assemblingdraftgenomesusingcontibait