General functions to transform associate data to host data, and their use in phylogenetic inference from sequences with intra-individual variability

Bibliographic Details
Main authors: Göker, Markus; Grimm, Guido W
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2291458/
https://www.ncbi.nlm.nih.gov/pubmed/18366660
http://dx.doi.org/10.1186/1471-2148-8-86
author Göker, Markus
Grimm, Guido W
collection PubMed
description BACKGROUND: Amongst the most commonly used molecular markers for plant phylogenetic studies are the nuclear ribosomal internal transcribed spacers (ITS). Intra-individual variability of these multicopy regions is a very common phenomenon in plants, the causes of which are debated in the literature. Phylogenetic reconstruction under these conditions is inherently difficult. Our approach is to consider this problem as a special case of the general biological question of how to infer the characteristics of hosts (represented here by plant individuals) from features of their associates (represented here by cloned sequences).
RESULTS: Six general transformation functions are introduced, covering the transformation of associate characters to discrete and continuous host characters, and the transformation of associate distances to host distances. A purely distance-based framework is established in which these transformation functions are applied to ITS sequences collected from the angiosperm genera Acer, Fagus and Zelkova. The formulae are also applied to allelic data of three different loci obtained from Rosa spp. The functions are validated by (1) phylogeny-independent measures of treelikeness; (2) correlation with independent host characters; (3) visualization using splits graphs and comparison with published data on the test organisms. The results agree well with these three measures and the datasets examined as well as with the theoretical predictions and previous results in the literature. High-quality distance matrices are obtained with four of the six transformation formulae. We demonstrate that one of them represents a generalization of the Sørensen coefficient, which is widely applied in ecology.
CONCLUSION: Because of their generality, the transformation functions may be applied to a wide range of biological problems that are interpretable in terms of hosts and associates. Regarding cloned sequences, the formulae have a high potential to accurately reflect evolutionary relationships within angiosperm genera, and to identify hybrids and ancestral taxa. These results corroborate earlier ones which showed that treelikeness measures are a valuable tool in comparative studies of biological distance functions.
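The abstract does not give the six transformation formulae, so the following Python sketch is only an illustration of the general idea of turning associate distances into a host distance, and of how such a function can generalize the Sørensen coefficient. It is not taken from the article: the function names, the "mean of minimum distances" rule, and the toy sequences are all assumptions made for this example. The only established fact used is the classical Sørensen dissimilarity, 1 - 2|A∩B| / (|A| + |B|).

```python
from typing import Callable, Iterable


def sorensen_dissimilarity(a: set, b: set) -> float:
    """Classical Sørensen dissimilarity between two sets of associates."""
    if not a and not b:
        return 0.0
    return 1.0 - 2.0 * len(a & b) / (len(a) + len(b))


def mean_min_host_distance(
    a: Iterable[str],
    b: Iterable[str],
    dist: Callable[[str, str], float],
) -> float:
    """Hypothetical host distance (an assumption, not the authors' formula).

    Each associate is matched to its nearest associate in the other host,
    and the nearest-neighbour distances are averaged over all associates.
    """
    a, b = list(a), list(b)
    if not a or not b:
        raise ValueError("each host needs at least one associate")
    total = sum(min(dist(x, y) for y in b) for x in a)
    total += sum(min(dist(x, y) for x in a) for y in b)
    return total / (len(a) + len(b))


def mismatch(x: str, y: str) -> float:
    """Binary associate distance: 0 if identical, 1 otherwise."""
    return 0.0 if x == y else 1.0


def hamming_fraction(s: str, t: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to equal length")
    return sum(c1 != c2 for c1, c2 in zip(s, t)) / len(s)


# Toy data: two plant individuals (hosts), each with two cloned sequences.
host_a = {"ACGT", "ACGA"}
host_b = {"ACGT", "TCGA"}

d_sorensen = sorensen_dissimilarity(host_a, host_b)
d_binary = mean_min_host_distance(host_a, host_b, mismatch)
d_graded = mean_min_host_distance(host_a, host_b, hamming_fraction)
```

With the binary associate distance, the hypothetical host distance reduces to the Sørensen dissimilarity, because only associates absent from the other host contribute. Replacing it with a graded sequence distance lets near-identical clones count as nearly shared, which is the sense in which a host distance can generalize a presence/absence coefficient.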
format Text
id pubmed-2291458
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2291458 2008-04-10 BMC Evol Biol Research Article BioMed Central 2008-03-18 /pmc/articles/PMC2291458/ /pubmed/18366660 http://dx.doi.org/10.1186/1471-2148-8-86 Text en Copyright ©2008 Göker and Grimm; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title General functions to transform associate data to host data, and their use in phylogenetic inference from sequences with intra-individual variability
topic Research Article