
Whole genome de novo assemblies of three divergent strains of rice, Oryza sativa, document novel gene space of aus and indica

BACKGROUND: The use of high throughput genome-sequencing technologies has uncovered a large extent of structural variation in eukaryotic genomes that makes important contributions to genomic diversity and phenotypic variation. When the genomes of different strains of a given organism are compared, w...


Bibliographic Details
Main authors: Schatz, Michael C, Maron, Lyza G, Stein, Joshua C, Wences, Alejandro Hernandez, Gurtowski, James, Biggers, Eric, Lee, Hayan, Kramer, Melissa, Antoniou, Eric, Ghiban, Elena, Wright, Mark H, Chia, Jer-ming, Ware, Doreen, McCouch, Susan R, McCombie, W Richard
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4268812/
https://www.ncbi.nlm.nih.gov/pubmed/25468217
http://dx.doi.org/10.1186/s13059-014-0506-z
author Schatz, Michael C
Maron, Lyza G
Stein, Joshua C
Wences, Alejandro Hernandez
Gurtowski, James
Biggers, Eric
Lee, Hayan
Kramer, Melissa
Antoniou, Eric
Ghiban, Elena
Wright, Mark H
Chia, Jer-ming
Ware, Doreen
McCouch, Susan R
McCombie, W Richard
author_sort Schatz, Michael C
collection PubMed
description BACKGROUND: The use of high throughput genome-sequencing technologies has uncovered a large extent of structural variation in eukaryotic genomes that makes important contributions to genomic diversity and phenotypic variation. When the genomes of different strains of a given organism are compared, whole genome resequencing data are typically aligned to an established reference sequence. However, when the reference differs in significant structural ways from the individuals under study, the analysis is often incomplete or inaccurate. RESULTS: Here, we use rice as a model to demonstrate how improvements in sequencing and assembly technology allow rapid and inexpensive de novo assembly of next generation sequence data into high-quality assemblies that can be directly compared using whole genome alignment to provide an unbiased assessment. Using this approach, we are able to accurately assess the ‘pan-genome’ of three divergent rice varieties and document several megabases of each genome absent in the other two. CONCLUSIONS: Many of the genome-specific loci are annotated to contain genes, reflecting the potential for new biological properties that would be missed by standard reference-mapping approaches. We further provide a detailed analysis of several loci associated with agriculturally important traits, including the S5 hybrid sterility locus, the Sub1 submergence tolerance locus, the LRK gene cluster associated with improved yield, and the Pup1 cluster associated with phosphorus deficiency, illustrating the utility of our approach for biological discovery. All of the data and software are openly available to support further breeding and functional studies of rice and other species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13059-014-0506-z) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4268812
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4268812 2014-12-17 Genome Biol Research BioMed Central 2014-12-03 2014 /pmc/articles/PMC4268812/ /pubmed/25468217 http://dx.doi.org/10.1186/s13059-014-0506-z Text en © Schatz et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Whole genome de novo assemblies of three divergent strains of rice, Oryza sativa, document novel gene space of aus and indica
topic Research