Séance: reference-based phylogenetic analysis for 18S rRNA studies

Bibliographic Details
Main Authors: Medlar, Alan, Aivelo, Tuomas, Löytynoja, Ari
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects: Software
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4265393/
https://www.ncbi.nlm.nih.gov/pubmed/25433763
http://dx.doi.org/10.1186/s12862-014-0235-7
author Medlar, Alan
Aivelo, Tuomas
Löytynoja, Ari
collection PubMed
description BACKGROUND: Marker gene studies often use short amplicons spanning one or more hypervariable regions from an rRNA gene to interrogate the community structure of uncultured environmental samples. Target regions are chosen for their discriminatory power, but the limited phylogenetic signal of short high-throughput sequencing reads precludes accurate phylogenetic analysis. This is particularly unfortunate in the study of microscopic eukaryotes where horizontal gene flow is limited and the rRNA gene is expected to accurately reflect the species phylogeny. A promising alternative to full phylogenetic analysis is phylogenetic placement, where a reference phylogeny is inferred using the complete marker gene and iteratively extended with the short sequences from a metagenetic sample under study.
RESULTS: Based on the phylogenetic placement approach we built Séance, a community analysis pipeline focused on the analysis of 18S marker gene data. Séance combines the alignment extension and phylogenetic placement capabilities of the Pagan multiple sequence alignment program with a suite of tools to preprocess, cluster and visualise datasets composed of many samples. We showcase Séance by analysing 454 data from a longitudinal study of intestinal parasite communities in wild rufous mouse lemurs (Microcebus rufus) as well as in simulation. We demonstrate both improved OTU picking at higher levels of sequence similarity for 454 data and show the accuracy of phylogenetic placement to be comparable to maximum likelihood methods for lower numbers of taxa.
CONCLUSIONS: Séance is an open source community analysis pipeline that provides reference-based phylogenetic analysis for rRNA marker gene studies. Whilst in this article we focus on studying nematodes using the 18S marker gene, the concepts are generic and reference data for alternative marker genes can be easily created. Séance can be downloaded from http://wasabiapp.org/software/seance/.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12862-014-0235-7) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4265393
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4265393 2014-12-15 Séance: reference-based phylogenetic analysis for 18S rRNA studies. Medlar, Alan; Aivelo, Tuomas; Löytynoja, Ari. BMC Evol Biol. Software. BioMed Central 2014-11-30. /pmc/articles/PMC4265393/ /pubmed/25433763 http://dx.doi.org/10.1186/s12862-014-0235-7 Text en. © Medlar et al.; licensee BioMed Central Ltd. 2014. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Séance: reference-based phylogenetic analysis for 18S rRNA studies
topic Software