Cutoffs and k-mers: implications from a transcriptome study in allopolyploid plants

BACKGROUND: Transcriptome analysis is increasingly being used to study the evolutionary origins and ecology of non-model plants. One issue for both transcriptome assembly and differential gene expression analyses is the common occurrence in plants of hybridisation and whole genome duplication (WGD)...

Bibliographic Details
Main authors: Gruenheit, Nicole; Deusch, Oliver; Esser, Christian; Becker, Matthias; Voelckel, Claudia; Lockhart, Peter
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3378427/
https://www.ncbi.nlm.nih.gov/pubmed/22417298
http://dx.doi.org/10.1186/1471-2164-13-92
author Gruenheit, Nicole
Deusch, Oliver
Esser, Christian
Becker, Matthias
Voelckel, Claudia
Lockhart, Peter
collection PubMed
description BACKGROUND: Transcriptome analysis is increasingly being used to study the evolutionary origins and ecology of non-model plants. One issue for both transcriptome assembly and differential gene expression analyses is the common occurrence in plants of whole genome duplication (WGD) and hybridisation resulting in allopolyploidy. The divergence of duplicated genes following WGD creates near-identical homeologues that can be problematic for de novo assembly and also for reference-based assembly protocols that use short reads (35-100 bp). RESULTS: Here we report a successful strategy for the assembly of two transcriptomes made using 75 bp Illumina reads from Pachycladon fastigiatum and Pachycladon cheesemanii. Both are allopolyploid plant species (2n = 20) that originated in the New Zealand Alps about 0.8 million years ago. In a systematic analysis of 19 different coverage cutoffs and 20 different k-mer sizes we showed that i) none of the genes could be assembled across all of the parameter space, ii) assembly of each gene required an optimal set of parameter values, and iii) these parameter values could be explained in part by different gene expression levels and different degrees of similarity between genes. CONCLUSIONS: To obtain optimal transcriptome assemblies for allopolyploid plants, k-mer size and k-mer coverage need to be considered simultaneously across a broad parameter space. This is important for assembling a maximum number of full-length ESTs and for avoiding chimeric assemblies of homeologous and paralogous gene copies.
format Online
Article
Text
id pubmed-3378427
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3378427 2012-06-20 Cutoffs and k-mers: implications from a transcriptome study in allopolyploid plants. Gruenheit, Nicole; Deusch, Oliver; Esser, Christian; Becker, Matthias; Voelckel, Claudia; Lockhart, Peter. BMC Genomics. Research Article. BioMed Central 2012-03-14 /pmc/articles/PMC3378427/ /pubmed/22417298 http://dx.doi.org/10.1186/1471-2164-13-92 Text en Copyright ©2012 Gruenheit et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Cutoffs and k-mers: implications from a transcriptome study in allopolyploid plants
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3378427/
https://www.ncbi.nlm.nih.gov/pubmed/22417298
http://dx.doi.org/10.1186/1471-2164-13-92