Hybrid assembly with long and short reads improves discovery of gene family expansions
BACKGROUND: Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. METHODS: We developed a hybrid assembly pipeline called “Alpaca” that can operat...
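Main authors: | Miller, Jason R.; Zhou, Peng; Mudge, Joann; Gurtowski, James; Lee, Hayan; Ramaraj, Thiruvarangan; Walenz, Brian P.; Liu, Junqi; Stupar, Robert M.; Denny, Roxanne; Song, Li; Singh, Namrata; Maron, Lyza G.; McCouch, Susan R.; McCombie, W. Richard; Schatz, Michael C.; Tiffin, Peter; Young, Nevin D.; Silverstein, Kevin A. T. |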
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5518131/ https://www.ncbi.nlm.nih.gov/pubmed/28724409 http://dx.doi.org/10.1186/s12864-017-3927-8 |
_version_ | 1783251431953793024 |
---|---|
author | Miller, Jason R.; Zhou, Peng; Mudge, Joann; Gurtowski, James; Lee, Hayan; Ramaraj, Thiruvarangan; Walenz, Brian P.; Liu, Junqi; Stupar, Robert M.; Denny, Roxanne; Song, Li; Singh, Namrata; Maron, Lyza G.; McCouch, Susan R.; McCombie, W. Richard; Schatz, Michael C.; Tiffin, Peter; Young, Nevin D.; Silverstein, Kevin A. T. |
author_sort | Miller, Jason R. |
collection | PubMed |
description | BACKGROUND: Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. METHODS: We developed a hybrid assembly pipeline called “Alpaca” that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation. RESULTS: Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies. CONCLUSION: Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-3927-8) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5518131 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-55181312017-08-16 Hybrid assembly with long and short reads improves discovery of gene family expansions Miller, Jason R. Zhou, Peng Mudge, Joann Gurtowski, James Lee, Hayan Ramaraj, Thiruvarangan Walenz, Brian P. Liu, Junqi Stupar, Robert M. Denny, Roxanne Song, Li Singh, Namrata Maron, Lyza G. McCouch, Susan R. McCombie, W. Richard Schatz, Michael C. Tiffin, Peter Young, Nevin D. Silverstein, Kevin A. T. BMC Genomics Methodology Article BACKGROUND: Long-read and short-read sequencing technologies offer competing advantages for eukaryotic genome sequencing projects. Combinations of both may be appropriate for surveys of within-species genomic variation. METHODS: We developed a hybrid assembly pipeline called “Alpaca” that can operate on 20X long-read coverage plus about 50X short-insert and 50X long-insert short-read coverage. To preclude collapse of tandem repeats, Alpaca relies on base-call-corrected long reads for contig formation. RESULTS: Compared to two other assembly protocols, Alpaca demonstrated the most reference agreement and repeat capture on the rice genome. On three accessions of the model legume Medicago truncatula, Alpaca generated the most agreement to a conspecific reference and predicted tandemly repeated genes absent from the other assemblies. CONCLUSION: Our results suggest Alpaca is a useful tool for investigating structural and copy number variation within de novo assemblies of sampled populations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-3927-8) contains supplementary material, which is available to authorized users. BioMed Central 2017-07-19 /pmc/articles/PMC5518131/ /pubmed/28724409 http://dx.doi.org/10.1186/s12864-017-3927-8 Text en © The Author(s). 
2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Miller, Jason R. Zhou, Peng Mudge, Joann Gurtowski, James Lee, Hayan Ramaraj, Thiruvarangan Walenz, Brian P. Liu, Junqi Stupar, Robert M. Denny, Roxanne Song, Li Singh, Namrata Maron, Lyza G. McCouch, Susan R. McCombie, W. Richard Schatz, Michael C. Tiffin, Peter Young, Nevin D. Silverstein, Kevin A. T. Hybrid assembly with long and short reads improves discovery of gene family expansions |
title | Hybrid assembly with long and short reads improves discovery of gene family expansions |
title_sort | hybrid assembly with long and short reads improves discovery of gene family expansions |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5518131/ https://www.ncbi.nlm.nih.gov/pubmed/28724409 http://dx.doi.org/10.1186/s12864-017-3927-8 |