
Genome and transcriptome sequencing characterises the gene space of Macadamia integrifolia (Proteaceae)


Bibliographic Details
Main authors: Nock, Catherine J., Baten, Abdul, Barkla, Bronwyn J., Furtado, Agnelo, Henry, Robert J., King, Graham J.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5114810/
https://www.ncbi.nlm.nih.gov/pubmed/27855648
http://dx.doi.org/10.1186/s12864-016-3272-3
collection PubMed
description BACKGROUND: The large Gondwanan plant family Proteaceae is an early-diverging eudicot lineage renowned for its morphological, taxonomic and ecological diversity. Macadamia is the most economically important Proteaceae crop and represents an ancient rainforest-restricted lineage. The family is a focus for studies of adaptive radiation due to remarkable species diversification in Mediterranean-climate biodiversity hotspots, and numerous evolutionary transitions between biomes. Despite a long history of research, comparative analyses in the Proteaceae and macadamia breeding programs are restricted by a paucity of genetic information. To address this, we sequenced the genome and transcriptome of the widely grown Macadamia integrifolia cultivar 741.
RESULTS: Over 95 gigabases of DNA and RNA-seq sequence data were de novo assembled and annotated. The draft assembly has a total length of 518 Mb and spans approximately 79% of the estimated genome size. Following annotation, 35,337 protein-coding genes were predicted, of which over 90% were expressed in at least one of the leaf, shoot or flower tissues examined. Gene family comparisons with five other eudicot species revealed 13,689 clusters containing macadamia genes and 1005 macadamia-specific clusters, and provided evidence for lineage-specific expansion of gene families involved in pathogen recognition, plant defense and monoterpene synthesis. Cyanogenesis is an important defense strategy in the Proteaceae, and a detailed analysis of macadamia gene homologues potentially involved in cyanogenic glycoside biosynthesis revealed several highly expressed candidate genes.
CONCLUSIONS: The gene space of macadamia provides a foundation for comparative genomics, gene discovery and the acceleration of molecular-assisted breeding. This study presents the first available genomic resources for the large basal eudicot family Proteaceae, access to most macadamia genes and opportunities to uncover the genetic basis of traits of importance for adaptation and crop improvement.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3272-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5114810
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5114810 2016-11-25 BMC Genomics Research Article BioMed Central 2016-11-17 /pmc/articles/PMC5114810/ /pubmed/27855648 http://dx.doi.org/10.1186/s12864-016-3272-3 Text en © The Author(s). 2016 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Genome and transcriptome sequencing characterises the gene space of Macadamia integrifolia (Proteaceae)
topic Research Article