Split-inducing indels in phylogenomic analysis

BACKGROUND: Most phylogenetic studies using molecular data treat gaps in multiple sequence alignments as missing data or even completely exclude alignment columns that contain gaps. RESULTS: Here we show that gap patterns in large-scale, genome-wide alignments are themselves phylogenetically informa...

Bibliographic Details
Main authors: Donath, Alexander; Stadler, Peter F.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6047143/
https://www.ncbi.nlm.nih.gov/pubmed/30026791
http://dx.doi.org/10.1186/s13015-018-0130-7
author Donath, Alexander
Stadler, Peter F.
collection PubMed
description BACKGROUND: Most phylogenetic studies using molecular data treat gaps in multiple sequence alignments as missing data or even completely exclude alignment columns that contain gaps. RESULTS: Here we show that gap patterns in large-scale, genome-wide alignments are themselves phylogenetically informative and can be used to infer reliable phylogenies provided the gap data are properly filtered to reduce noise introduced by the alignment method. We introduce here the notion of split-inducing indels (splids) that define an approximate bipartition of the taxon set. We show both in simulated data and in case studies on real-life data that splids can be efficiently extracted from phylogenomic data sets. CONCLUSIONS: Suitably processed gap patterns extracted from genome-wide alignments provide a surprisingly clear phylogenetic signal and allow the inference of accurate phylogenetic trees. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13015-018-0130-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6047143
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6047143 2018-07-19 Split-inducing indels in phylogenomic analysis Donath, Alexander Stadler, Peter F. Algorithms Mol Biol Research BioMed Central 2018-07-16 /pmc/articles/PMC6047143/ /pubmed/30026791 http://dx.doi.org/10.1186/s13015-018-0130-7 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Split-inducing indels in phylogenomic analysis
topic Research
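
The abstract describes indels whose presence or absence divides the taxon set into two groups. As a rough illustration of that idea only, the Python sketch below reads a toy alignment, records each maximal gap run by its column interval, and reports a bipartition whenever the same gap is shared by several taxa and absent from several others. The function names (`gap_runs`, `gap_splits`), the `min_side` threshold, and the toy sequences are hypothetical and are not taken from the article. The sketch requires exact gap coordinates and does not reproduce the authors' filtering or their notion of an approximate bipartition.

```python
from collections import defaultdict


def gap_runs(seq, gap="-"):
    """Return half-open (start, end) intervals of maximal gap runs in one aligned row."""
    runs, start = [], None
    for i, ch in enumerate(seq):
        if ch == gap and start is None:
            start = i                      # a gap run begins here
        elif ch != gap and start is not None:
            runs.append((start, i))        # the gap run ended at the previous column
            start = None
    if start is not None:                  # a gap run reaches the end of the row
        runs.append((start, len(seq)))
    return runs


def gap_splits(alignment, min_side=2):
    """Map each shared gap interval to the bipartition (taxa with gap, taxa without).

    Assumption for this sketch: a gap counts as shared only if its coordinates
    are identical across rows, and a split is kept only if both sides contain
    at least `min_side` taxa.
    """
    taxa = set(alignment)
    carriers = defaultdict(set)
    for taxon, seq in alignment.items():
        for run in gap_runs(seq):
            carriers[run].add(taxon)

    splits = {}
    for run, with_gap in carriers.items():
        without_gap = taxa - with_gap
        if len(with_gap) >= min_side and len(without_gap) >= min_side:
            splits[run] = (frozenset(with_gap), frozenset(without_gap))
    return splits


# Toy alignment (invented for illustration): A and B share a 3-column gap,
# while E carries a single-taxon gap that induces no informative split.
toy = {
    "A": "ACG---TTGA",
    "B": "ACG---TTGA",
    "C": "ACGTCATTGA",
    "D": "ACGTCATTGA",
    "E": "ACGTCAT-GA",
}

splits = gap_splits(toy)
for run, (left, right) in splits.items():
    print(run, sorted(left), "|", sorted(right))
# prints: (3, 6) ['A', 'B'] | ['C', 'D', 'E']
```

In this toy case only the gap at columns 3 to 5 survives the `min_side` threshold. The gap found only in taxon E is discarded because it separates a single taxon from the rest.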