LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system

BACKGROUND: A standard procedure in many areas of bioinformatics is to use a multiple sequence alignment (MSA) as the basis for various types of homology-based inference. Applications include 3D structure modelling, protein functional annotation, prediction of molecular interactions, etc. These appl...

Bibliographic Details
Main authors: Vanhoutreve, Renaud, Kress, Arnaud, Legrand, Baptiste, Gass, Hélène, Poch, Olivier, Thompson, Julie D.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4936259/
https://www.ncbi.nlm.nih.gov/pubmed/27387560
http://dx.doi.org/10.1186/s12859-016-1146-y
_version_ 1782441537198620672
author Vanhoutreve, Renaud
Kress, Arnaud
Legrand, Baptiste
Gass, Hélène
Poch, Olivier
Thompson, Julie D.
collection PubMed
description BACKGROUND: A standard procedure in many areas of bioinformatics is to use a multiple sequence alignment (MSA) as the basis for various types of homology-based inference. Applications include 3D structure modelling, protein functional annotation, prediction of molecular interactions, etc. These applications, however sophisticated, are generally highly sensitive to the alignment used, and neglecting non-homologous or uncertain regions in the alignment can lead to significant bias in the subsequent inferences. RESULTS: Here, we present a new method, LEON-BIS, which uses a robust Bayesian framework to estimate the homologous relations between sequences in a protein multiple alignment. Sequences are clustered into sub-families and relations are predicted at different levels, including ‘core blocks’, ‘regions’ and full-length proteins. The accuracy and reliability of the predictions are demonstrated in large-scale comparisons using well annotated alignment databases, where the homologous sequence segments are detected with very high sensitivity and specificity. CONCLUSIONS: LEON-BIS uses robust Bayesian statistics to distinguish the portions of multiple sequence alignments that are conserved either across the whole family or within subfamilies. LEON-BIS should thus be useful for automatic, high-throughput genome annotations, 2D/3D structure predictions, protein-protein interaction predictions etc.
format Online
Article
Text
id pubmed-4936259
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4936259 2016-07-07 LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system Vanhoutreve, Renaud Kress, Arnaud Legrand, Baptiste Gass, Hélène Poch, Olivier Thompson, Julie D. BMC Bioinformatics Methodology Article BACKGROUND: A standard procedure in many areas of bioinformatics is to use a multiple sequence alignment (MSA) as the basis for various types of homology-based inference. Applications include 3D structure modelling, protein functional annotation, prediction of molecular interactions, etc. These applications, however sophisticated, are generally highly sensitive to the alignment used, and neglecting non-homologous or uncertain regions in the alignment can lead to significant bias in the subsequent inferences. RESULTS: Here, we present a new method, LEON-BIS, which uses a robust Bayesian framework to estimate the homologous relations between sequences in a protein multiple alignment. Sequences are clustered into sub-families and relations are predicted at different levels, including ‘core blocks’, ‘regions’ and full-length proteins. The accuracy and reliability of the predictions are demonstrated in large-scale comparisons using well annotated alignment databases, where the homologous sequence segments are detected with very high sensitivity and specificity. CONCLUSIONS: LEON-BIS uses robust Bayesian statistics to distinguish the portions of multiple sequence alignments that are conserved either across the whole family or within subfamilies. LEON-BIS should thus be useful for automatic, high-throughput genome annotations, 2D/3D structure predictions, protein-protein interaction predictions etc. BioMed Central 2016-07-07 /pmc/articles/PMC4936259/ /pubmed/27387560 http://dx.doi.org/10.1186/s12859-016-1146-y Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4936259/
https://www.ncbi.nlm.nih.gov/pubmed/27387560
http://dx.doi.org/10.1186/s12859-016-1146-y