Short Tree, Long Tree, Right Tree, Wrong Tree: New Acquisition Bias Corrections for Inferring SNP Phylogenies

Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of...

Bibliographic Details
Main authors: Leaché, Adam D.; Banbury, Barbara L.; Felsenstein, Joseph; de Oca, Adrián Nieto-Montes; Stamatakis, Alexandros
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4604835/
https://www.ncbi.nlm.nih.gov/pubmed/26227865
http://dx.doi.org/10.1093/sysbio/syv053
author Leaché, Adam D.
Banbury, Barbara L.
Felsenstein, Joseph
de Oca, Adrián Nieto-Montes
Stamatakis, Alexandros
collection PubMed
description Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practices for using these data in phylogenetics is lacking. We use computer simulations and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis: the analysis of full sequences (i.e., SNPs together with invariant sites) and the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs: a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach. Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the presence of missing data. Phylogenetic analysis of RAD loci requires careful attention to model assumptions, especially if downstream analyses depend on branch lengths.
format Online
Article
Text
id pubmed-4604835
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4604835 2015-10-19 Syst Biol Regular Articles Oxford University Press 2015-11 2015-07-29 /pmc/articles/PMC4604835/ /pubmed/26227865 http://dx.doi.org/10.1093/sysbio/syv053 Text en © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Short Tree, Long Tree, Right Tree, Wrong Tree: New Acquisition Bias Corrections for Inferring SNP Phylogenies
topic Regular Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4604835/
https://www.ncbi.nlm.nih.gov/pubmed/26227865
http://dx.doi.org/10.1093/sysbio/syv053