
Can Deliberately Incomplete Gene Sample Augmentation Improve a Phylogeny Estimate for the Advanced Moths and Butterflies (Hexapoda: Lepidoptera)?

This paper addresses the question of whether one can economically improve the robustness of a molecular phylogeny estimate by increasing gene sampling in only a subset of taxa, without having the analysis invalidated by artifacts arising from large blocks of missing data. Our case study stems from a...


Bibliographic Details
Main authors: Cho, Soowon; Zwick, Andreas; Regier, Jerome C.; Mitter, Charles; Cummings, Michael P.; Yao, Jianxiu; Du, Zaile; Zhao, Hong; Kawahara, Akito Y.; Weller, Susan; Davis, Donald R.; Baixeras, Joaquin; Brown, John W.; Parr, Cynthia
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3193767/
https://www.ncbi.nlm.nih.gov/pubmed/21840842
http://dx.doi.org/10.1093/sysbio/syr079
_version_ 1782213882004111360
author Cho, Soowon
Zwick, Andreas
Regier, Jerome C.
Mitter, Charles
Cummings, Michael P.
Yao, Jianxiu
Du, Zaile
Zhao, Hong
Kawahara, Akito Y.
Weller, Susan
Davis, Donald R.
Baixeras, Joaquin
Brown, John W.
Parr, Cynthia
author_sort Cho, Soowon
collection PubMed
description This paper addresses the question of whether one can economically improve the robustness of a molecular phylogeny estimate by increasing gene sampling in only a subset of taxa, without having the analysis invalidated by artifacts arising from large blocks of missing data. Our case study stems from an ongoing effort to resolve poorly understood deeper relationships in the large clade Ditrysia (>150,000 species) of the insect order Lepidoptera (butterflies and moths). Seeking to remedy the overall weak support for deeper divergences in an initial study based on five nuclear genes (6.6 kb) in 123 exemplars, we nearly tripled the total gene sample (to 26 genes, 18.4 kb) but only in a third (41) of the taxa. The resulting partially augmented data matrix (45% intentionally missing data) consistently increased bootstrap support for groupings previously identified in the five-gene (nearly) complete matrix, while introducing no contradictory groupings of the kind that missing data have been predicted to produce. Our results add to growing evidence that data sets differing substantially in gene and taxon sampling can often be safely and profitably combined. The strongest overall support for nodes above the family level came from including all nucleotide changes, while partitioning sites into sets undergoing mostly nonsynonymous versus mostly synonymous change. In contrast, support for the deepest node for which any persuasive molecular evidence has yet emerged (78–85% bootstrap) was weak or nonexistent unless synonymous change was entirely excluded, a result plausibly attributed to compositional heterogeneity. This node (Gelechioidea + Apoditrysia), tentatively proposed by previous authors on the basis of four morphological synapomorphies, is the first major subset of ditrysian superfamilies to receive strong statistical support in any phylogenetic study. A “more-genes-only” data set (41 taxa × 26 genes) also gave strong signal for a second deep grouping (Macrolepidoptera) that was obscured, but not strongly contradicted, in more taxon-rich analyses.
format Online
Article
Text
id pubmed-3193767
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3193767 2011-10-17 Syst Biol Regular Articles Oxford University Press 2011-12 2011-08-11 /pmc/articles/PMC3193767/ /pubmed/21840842 http://dx.doi.org/10.1093/sysbio/syr079 Text en © The Author(s) 2011. Published by Oxford University Press on behalf of the Society of Systematic Biologists. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Can Deliberately Incomplete Gene Sample Augmentation Improve a Phylogeny Estimate for the Advanced Moths and Butterflies (Hexapoda: Lepidoptera)?
topic Regular Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3193767/
https://www.ncbi.nlm.nih.gov/pubmed/21840842
http://dx.doi.org/10.1093/sysbio/syr079