First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae)

Oak represents a valuable natural resource across Northern Hemisphere ecosystems, attracting a large research community studying its genetics, ecology, conservation, and management. Here we introduce a draft genome assembly of valley oak (Quercus lobata) using Illumina sequencing of adult leaf tissu...

Bibliographic Details
Main Authors: Sork, Victoria L., Fitz-Gibbon, Sorel T., Puiu, Daniela, Crepeau, Marc, Gugger, Paul F., Sherman, Rachel, Stevens, Kristian, Langley, Charles H., Pellegrini, Matteo, Salzberg, Steven L.
Format: Online Article Text
Language: English
Published: Genetics Society of America 2016
Subjects: Genomic Selection
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5100847/
https://www.ncbi.nlm.nih.gov/pubmed/27621377
http://dx.doi.org/10.1534/g3.116.030411
author Sork, Victoria L.
Fitz-Gibbon, Sorel T.
Puiu, Daniela
Crepeau, Marc
Gugger, Paul F.
Sherman, Rachel
Stevens, Kristian
Langley, Charles H.
Pellegrini, Matteo
Salzberg, Steven L.
collection PubMed
description Oak represents a valuable natural resource across Northern Hemisphere ecosystems, attracting a large research community studying its genetics, ecology, conservation, and management. Here we introduce a draft genome assembly of valley oak (Quercus lobata) using Illumina sequencing of adult leaf tissue of a tree found in an accessible, well-studied, natural southern California population. Our assembly includes a nuclear genome and a complete chloroplast genome, along with annotation of encoded genes. The assembly contains 94,394 scaffolds totaling 1.17 Gb, of which 18,512 scaffolds are 2 kb or longer, with a total length of 1.15 Gb and an N50 scaffold size of 278,077 bp. The k-mer histograms indicate a diploid genome size of ∼720–730 Mb, which is smaller than the total length due to high heterozygosity, estimated at 1.25%. A comparison with a recently published European oak (Q. robur) nuclear sequence indicates 93% similarity. The Q. lobata chloroplast genome has 99% identity with another North American oak, Q. rubra. Preliminary annotation yielded an estimate of 61,773 predicted protein-coding genes, of which 71% had similarity to known protein domains. We searched 956 Benchmarking Universal Single-Copy Orthologs, and found 863 complete orthologs, of which 450 were present in > 1 copy. We also examined an earlier version (v0.5) where duplicate haplotypes were removed to discover variants. These additional sources indicate that the predicted gene count in Version 1.0 is overestimated by 37–52%. Nonetheless, this first draft valley oak genome assembly represents a high-quality, well-annotated genome that provides a tool for forest restoration and management practices.
format Online
Article
Text
id pubmed-5100847
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-5100847 2016-11-09 First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae) Sork, Victoria L. Fitz-Gibbon, Sorel T. Puiu, Daniela Crepeau, Marc Gugger, Paul F. Sherman, Rachel Stevens, Kristian Langley, Charles H. Pellegrini, Matteo Salzberg, Steven L. G3 (Bethesda) Genomic Selection Oak represents a valuable natural resource across Northern Hemisphere ecosystems, attracting a large research community studying its genetics, ecology, conservation, and management. Here we introduce a draft genome assembly of valley oak (Quercus lobata) using Illumina sequencing of adult leaf tissue of a tree found in an accessible, well-studied, natural southern California population. Our assembly includes a nuclear genome and a complete chloroplast genome, along with annotation of encoded genes. The assembly contains 94,394 scaffolds totaling 1.17 Gb, of which 18,512 scaffolds are 2 kb or longer, with a total length of 1.15 Gb and an N50 scaffold size of 278,077 bp. The k-mer histograms indicate a diploid genome size of ∼720–730 Mb, which is smaller than the total length due to high heterozygosity, estimated at 1.25%. A comparison with a recently published European oak (Q. robur) nuclear sequence indicates 93% similarity. The Q. lobata chloroplast genome has 99% identity with another North American oak, Q. rubra. Preliminary annotation yielded an estimate of 61,773 predicted protein-coding genes, of which 71% had similarity to known protein domains. We searched 956 Benchmarking Universal Single-Copy Orthologs, and found 863 complete orthologs, of which 450 were present in > 1 copy. We also examined an earlier version (v0.5) where duplicate haplotypes were removed to discover variants. These additional sources indicate that the predicted gene count in Version 1.0 is overestimated by 37–52%. Nonetheless, this first draft valley oak genome assembly represents a high-quality, well-annotated genome that provides a tool for forest restoration and management practices. Genetics Society of America 2016-09-12 /pmc/articles/PMC5100847/ /pubmed/27621377 http://dx.doi.org/10.1534/g3.116.030411 Text en Copyright © 2016 Sork et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae)
topic Genomic Selection
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5100847/
https://www.ncbi.nlm.nih.gov/pubmed/27621377
http://dx.doi.org/10.1534/g3.116.030411