First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae)
Oak represents a valuable natural resource across Northern Hemisphere ecosystems, attracting a large research community studying its genetics, ecology, conservation, and management. Here we introduce a draft genome assembly of valley oak (Quercus lobata) using Illumina sequencing of adult leaf tissu...
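Main Authors: | Sork, Victoria L., Fitz-Gibbon, Sorel T., Puiu, Daniela, Crepeau, Marc, Gugger, Paul F., Sherman, Rachel, Stevens, Kristian, Langley, Charles H., Pellegrini, Matteo, Salzberg, Steven L. |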
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Genetics Society of America 2016 |
Subjects: | Genomic Selection |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5100847/ https://www.ncbi.nlm.nih.gov/pubmed/27621377 http://dx.doi.org/10.1534/g3.116.030411 |
_version_ | 1782466201048317952 |
---|---|
author | Sork, Victoria L. Fitz-Gibbon, Sorel T. Puiu, Daniela Crepeau, Marc Gugger, Paul F. Sherman, Rachel Stevens, Kristian Langley, Charles H. Pellegrini, Matteo Salzberg, Steven L. |
author_facet | Sork, Victoria L. Fitz-Gibbon, Sorel T. Puiu, Daniela Crepeau, Marc Gugger, Paul F. Sherman, Rachel Stevens, Kristian Langley, Charles H. Pellegrini, Matteo Salzberg, Steven L. |
author_sort | Sork, Victoria L. |
collection | PubMed |
description | Oak represents a valuable natural resource across Northern Hemisphere ecosystems, attracting a large research community studying its genetics, ecology, conservation, and management. Here we introduce a draft genome assembly of valley oak (Quercus lobata) using Illumina sequencing of adult leaf tissue of a tree found in an accessible, well-studied, natural southern California population. Our assembly includes a nuclear genome and a complete chloroplast genome, along with annotation of encoded genes. The assembly contains 94,394 scaffolds, totaling 1.17 Gb with 18,512 scaffolds of length 2 kb or longer, with a total length of 1.15 Gb, and an N50 scaffold size of 278,077 bp. The k-mer histograms indicate a diploid genome size of ∼720–730 Mb, which is smaller than the total length due to high heterozygosity, estimated at 1.25%. A comparison with a recently published European oak (Q. robur) nuclear sequence indicates 93% similarity. The Q. lobata chloroplast genome has 99% identity with another North American oak, Q. rubra. Preliminary annotation yielded an estimate of 61,773 predicted protein-coding genes, of which 71% had similarity to known protein domains. We searched 956 Benchmarking Universal Single-Copy Orthologs, and found 863 complete orthologs, of which 450 were present in > 1 copy. We also examined an earlier version (v0.5) where duplicate haplotypes were removed to discover variants. These additional sources indicate that the predicted gene count in Version 1.0 is overestimated by 37–52%. Nonetheless, this first draft valley oak genome assembly represents a high-quality, well-annotated genome that provides a tool for forest restoration and management practices. |
format | Online Article Text |
id | pubmed-5100847 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Genetics Society of America |
record_format | MEDLINE/PubMed |
spelling | pubmed-5100847 2016-11-09 First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae) Sork, Victoria L. Fitz-Gibbon, Sorel T. Puiu, Daniela Crepeau, Marc Gugger, Paul F. Sherman, Rachel Stevens, Kristian Langley, Charles H. Pellegrini, Matteo Salzberg, Steven L. G3 (Bethesda) Genomic Selection Oak represents a valuable natural resource across Northern Hemisphere ecosystems, attracting a large research community studying its genetics, ecology, conservation, and management. Here we introduce a draft genome assembly of valley oak (Quercus lobata) using Illumina sequencing of adult leaf tissue of a tree found in an accessible, well-studied, natural southern California population. Our assembly includes a nuclear genome and a complete chloroplast genome, along with annotation of encoded genes. The assembly contains 94,394 scaffolds, totaling 1.17 Gb with 18,512 scaffolds of length 2 kb or longer, with a total length of 1.15 Gb, and an N50 scaffold size of 278,077 bp. The k-mer histograms indicate a diploid genome size of ∼720–730 Mb, which is smaller than the total length due to high heterozygosity, estimated at 1.25%. A comparison with a recently published European oak (Q. robur) nuclear sequence indicates 93% similarity. The Q. lobata chloroplast genome has 99% identity with another North American oak, Q. rubra. Preliminary annotation yielded an estimate of 61,773 predicted protein-coding genes, of which 71% had similarity to known protein domains. We searched 956 Benchmarking Universal Single-Copy Orthologs, and found 863 complete orthologs, of which 450 were present in > 1 copy. We also examined an earlier version (v0.5) where duplicate haplotypes were removed to discover variants. These additional sources indicate that the predicted gene count in Version 1.0 is overestimated by 37–52%. Nonetheless, this first draft valley oak genome assembly represents a high-quality, well-annotated genome that provides a tool for forest restoration and management practices. Genetics Society of America 2016-09-12 /pmc/articles/PMC5100847/ /pubmed/27621377 http://dx.doi.org/10.1534/g3.116.030411 Text en Copyright © 2016 Sork et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Genomic Selection Sork, Victoria L. Fitz-Gibbon, Sorel T. Puiu, Daniela Crepeau, Marc Gugger, Paul F. Sherman, Rachel Stevens, Kristian Langley, Charles H. Pellegrini, Matteo Salzberg, Steven L. First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae) |
title | First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae) |
title_full | First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae) |
title_fullStr | First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae) |
title_full_unstemmed | First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae) |
title_short | First Draft Assembly and Annotation of the Genome of a California Endemic Oak Quercus lobata Née (Fagaceae) |
title_sort | first draft assembly and annotation of the genome of a california endemic oak quercus lobata née (fagaceae) |
topic | Genomic Selection |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5100847/ https://www.ncbi.nlm.nih.gov/pubmed/27621377 http://dx.doi.org/10.1534/g3.116.030411 |
work_keys_str_mv | AT sorkvictorial firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT fitzgibbonsorelt firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT puiudaniela firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT crepeaumarc firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT guggerpaulf firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT shermanrachel firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT stevenskristian firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT langleycharlesh firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT pellegrinimatteo firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae AT salzbergstevenl firstdraftassemblyandannotationofthegenomeofacaliforniaendemicoakquercuslobataneefagaceae |