Assembling draft genomes using contiBAIT
SUMMARY: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or...
Main Authors: | O’Neill, Kieran; Hills, Mark; Gottlieb, Mike; Borkowski, Matthew; Karsan, Aly; Lansdorp, Peter M |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2017 |
Subjects: | Applications Notes |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860061/ https://www.ncbi.nlm.nih.gov/pubmed/28475666 http://dx.doi.org/10.1093/bioinformatics/btx281 |
_version_ | 1783307939678781440 |
---|---|
author | O’Neill, Kieran; Hills, Mark; Gottlieb, Mike; Borkowski, Matthew; Karsan, Aly; Lansdorp, Peter M |
author_facet | O’Neill, Kieran; Hills, Mark; Gottlieb, Mike; Borkowski, Matthew; Karsan, Aly; Lansdorp, Peter M |
author_sort | O’Neill, Kieran |
collection | PubMed |
description | SUMMARY: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or as chromosome builds (consisting of correctly ordered, but not necessarily correctly oriented scaffolds separated by gaps). Completion of de novo assemblies remains difficult, as regions that are repetitive or hard to sequence prevent the accumulation of larger scaffolds, and create errors such as misorientations and mislocalizations. Thus, complementary methods for determining the orientation and positioning of fragments are important for finishing assemblies. Strand-seq is a method for determining template strand inheritance in single cells, information that can be used to determine relative genomic distance and orientation between scaffolds, and find errors within them. We present contiBAIT, an R/Bioconductor package which uses Strand-seq data to repair and improve existing assemblies. AVAILABILITY AND IMPLEMENTATION: contiBAIT is available on Bioconductor. Source files available from GitHub. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-5860061 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-58600612018-03-23 Assembling draft genomes using contiBAIT O’Neill, Kieran Hills, Mark Gottlieb, Mike Borkowski, Matthew Karsan, Aly Lansdorp, Peter M Bioinformatics Applications Notes SUMMARY: Massively parallel sequencing is now widely used, but data interpretation is only as good as the reference assembly to which it is aligned. While the number of reference assemblies has rapidly expanded, most of these remain at intermediate stages of completion, either as scaffold builds, or as chromosome builds (consisting of correctly ordered, but not necessarily correctly oriented scaffolds separated by gaps). Completion of de novo assemblies remains difficult, as regions that are repetitive or hard to sequence prevent the accumulation of larger scaffolds, and create errors such as misorientations and mislocalizations. Thus, complementary methods for determining the orientation and positioning of fragments are important for finishing assemblies. Strand-seq is a method for determining template strand inheritance in single cells, information that can be used to determine relative genomic distance and orientation between scaffolds, and find errors within them. We present contiBAIT, an R/Bioconductor package which uses Strand-seq data to repair and improve existing assemblies. AVAILABILITY AND IMPLEMENTATION: contiBAIT is available on Bioconductor. Source files available from GitHub. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2017-09-01 2017-05-05 /pmc/articles/PMC5860061/ /pubmed/28475666 http://dx.doi.org/10.1093/bioinformatics/btx281 Text en © The Author 2017. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Notes O’Neill, Kieran Hills, Mark Gottlieb, Mike Borkowski, Matthew Karsan, Aly Lansdorp, Peter M Assembling draft genomes using contiBAIT |
title | Assembling draft genomes using contiBAIT |
title_full | Assembling draft genomes using contiBAIT |
title_fullStr | Assembling draft genomes using contiBAIT |
title_full_unstemmed | Assembling draft genomes using contiBAIT |
title_short | Assembling draft genomes using contiBAIT |
title_sort | assembling draft genomes using contibait |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860061/ https://www.ncbi.nlm.nih.gov/pubmed/28475666 http://dx.doi.org/10.1093/bioinformatics/btx281 |
work_keys_str_mv | AT oneillkieran assemblingdraftgenomesusingcontibait AT hillsmark assemblingdraftgenomesusingcontibait AT gottliebmike assemblingdraftgenomesusingcontibait AT borkowskimatthew assemblingdraftgenomesusingcontibait AT karsanaly assemblingdraftgenomesusingcontibait AT lansdorppeterm assemblingdraftgenomesusingcontibait |