An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing
The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs w...
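Main authors: | Zimin, Aleksey V.; Stevens, Kristian A.; Crepeau, Marc W.; Puiu, Daniela; Wegrzyn, Jill L.; Yorke, James A.; Langley, Charles H.; Neale, David B.; Salzberg, Steven L. |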
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2017 |
Subjects: | Data Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5437942/ https://www.ncbi.nlm.nih.gov/pubmed/28369353 http://dx.doi.org/10.1093/gigascience/giw016 |
_version_ | 1783237677266501632 |
---|---|
author | Zimin, Aleksey V.; Stevens, Kristian A.; Crepeau, Marc W.; Puiu, Daniela; Wegrzyn, Jill L.; Yorke, James A.; Langley, Charles H.; Neale, David B.; Salzberg, Steven L. |
author_facet | Zimin, Aleksey V.; Stevens, Kristian A.; Crepeau, Marc W.; Puiu, Daniela; Wegrzyn, Jill L.; Yorke, James A.; Langley, Charles H.; Neale, David B.; Salzberg, Steven L. |
author_sort | Zimin, Aleksey V. |
collection | PubMed |
description | The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361 bp, more than three times as large as that of the original assembly, and an N50 scaffold size of 107 821 bp, 61% larger than that of the previous assembly. |
format | Online Article Text |
id | pubmed-5437942 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-54379422017-06-14 An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing Zimin, Aleksey V. Stevens, Kristian A. Crepeau, Marc W. Puiu, Daniela Wegrzyn, Jill L. Yorke, James A. Langley, Charles H. Neale, David B. Salzberg, Steven L. Gigascience Data Note The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821, 61% larger than the previous assembly. Oxford University Press 2017-02-15 /pmc/articles/PMC5437942/ /pubmed/28369353 http://dx.doi.org/10.1093/gigascience/giw016 Text en © The Author 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Data Note Zimin, Aleksey V. Stevens, Kristian A. Crepeau, Marc W. Puiu, Daniela Wegrzyn, Jill L. Yorke, James A. Langley, Charles H. Neale, David B. Salzberg, Steven L. An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing |
title | An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing |
title_sort | improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing |
topic | Data Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5437942/ https://www.ncbi.nlm.nih.gov/pubmed/28369353 http://dx.doi.org/10.1093/gigascience/giw016 |
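
The description field in this record reports assembly quality as N50 contig and scaffold sizes. As a rough illustration only, the sketch below computes N50 under its common definition: the largest length L such that sequences of length L or longer together cover at least half of the assembled bases. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the article or from MaSuRCA.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumes the common definition: sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not real P. taeda data): total = 30 bases.
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))  # 10 + 6 = 16 >= 15, so N50 is 6

# Ratio of the contig N50 values quoted in the description field.
print(round(25361 / 8206, 2))  # 3.09, consistent with "more than three times"
```

Because N50 weights each sequence by its length, a few long contigs can raise it substantially even when millions of short contigs remain, which is why it is often preferred over the plain mean for comparing assemblies.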