Hunting Down Frame Shifts: Ecological Analysis of Diverse Functional Gene Sequences

Functional gene ecological analyses using amplicon sequencing can be challenging as translated sequences are often burdened with shifted reading frames. The aim of this work was to evaluate several bioinformatics tools designed to correct errors which arise during sequencing in an effort to reduce t...

Bibliographic Details
Main Authors: Strejcek, Michal; Wang, Qiong; Ridl, Jakub; Uhlik, Ondrej
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4656815/
https://www.ncbi.nlm.nih.gov/pubmed/26635739
http://dx.doi.org/10.3389/fmicb.2015.01267
author Strejcek, Michal
Wang, Qiong
Ridl, Jakub
Uhlik, Ondrej
author_sort Strejcek, Michal
collection PubMed
description Functional gene ecological analyses using amplicon sequencing can be challenging because translated sequences are often burdened with shifted reading frames. The aim of this work was to evaluate several bioinformatics tools designed to correct errors that arise during sequencing, in an effort to reduce the number of frameshifts (FS). Genes encoding the alpha subunits of biphenyl (bphA) and benzoate (benA) dioxygenases were used as model sequences. FrameBot, an FS correction tool, was able to reduce the number of detected FS to zero. However, up to 44% of sequences were discarded by FrameBot as non-specific targets. Therefore, we proposed a de novo mode of FrameBot for FS correction, which works on a similar basis to common chimera-identifying platforms and does not depend on reference sequences. By the nature of the FrameBot de novo design, it is crucial to provide it with data that are as error-free as possible. We tested the ability of several publicly available correction tools to decrease the number of errors in the data sets. The combination of maximum expected error filtering and single linkage pre-clustering proved to be the most efficient read processing approach. Applying FrameBot de novo to the processed data enabled analysis of BphA sequences with minimal losses of potentially functional sequences not homologous to those previously known. This experiment also demonstrated the extensive diversity of dioxygenases in soil. A script that performs FrameBot de novo is presented in the supplementary material to the study and is available at https://github.com/strejcem/FBdenovo. The tool was also implemented in the FunGene Pipeline, available at http://fungene.cme.msu.edu/FunGenePipeline/.
format Online
Article
Text
id pubmed-4656815
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4656815 2015-12-03 Hunting Down Frame Shifts: Ecological Analysis of Diverse Functional Gene Sequences Strejcek, Michal Wang, Qiong Ridl, Jakub Uhlik, Ondrej Front Microbiol Microbiology Frontiers Media S.A. 
2015-11-24 /pmc/articles/PMC4656815/ /pubmed/26635739 http://dx.doi.org/10.3389/fmicb.2015.01267 Text en Copyright © 2015 Strejcek, Wang, Ridl and Uhlik. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Hunting Down Frame Shifts: Ecological Analysis of Diverse Functional Gene Sequences
topic Microbiology