Cargando…

EAPhy: A Flexible Tool for High-throughput Quality Filtering of Exon-alignments and Data Processing for Phylogenetic Methods

Recently developed molecular methods enable geneticists to target and sequence thousands of orthologous loci and infer evolutionary relationships across the tree of life. Large numbers of genetic markers benefit species tree inference but visual inspection of alignment quality, as traditionally cond...

Descripción completa

Detalles Bibliográficos
Autor principal: Blom, Mozes P K
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Public Library of Science 2015
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4542277/
https://www.ncbi.nlm.nih.gov/pubmed/26331095
http://dx.doi.org/10.1371/currents.tol.75134257bd389c04bc1d26d42aa9089f
Descripción
Sumario:Recently developed molecular methods enable geneticists to target and sequence thousands of orthologous loci and infer evolutionary relationships across the tree of life. Large numbers of genetic markers benefit species tree inference but visual inspection of alignment quality, as traditionally conducted, is challenging with thousands of loci. Furthermore, due to the impracticality of repeated visual inspection with alternative filtering criteria, the potential consequences of using datasets with different degrees of missing data remain nominally explored in most empirical phylogenomic studies. In this short communication, I describe a flexible high-throughput pipeline designed to assess alignment quality and filter exonic sequence data for subsequent inference. The stringency criteria for alignment quality and missing data can be adapted based on the expected level of sequence divergence. Each alignment is automatically evaluated based on the stringency criteria specified, significantly reducing the number of alignments that require visual inspection. By developing a rapid method for alignment filtering and quality assessment, the consistency of phylogenetic estimation based on exonic sequence alignments can be further explored across distinct inference methods, while accounting for different degrees of missing data.