LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system
BACKGROUND: A standard procedure in many areas of bioinformatics is to use a multiple sequence alignment (MSA) as the basis for various types of homology-based inference. Applications include 3D structure modelling, protein functional annotation, prediction of molecular interactions, etc. These appl...
Main authors: | Vanhoutreve, Renaud; Kress, Arnaud; Legrand, Baptiste; Gass, Hélène; Poch, Olivier; Thompson, Julie D. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4936259/ https://www.ncbi.nlm.nih.gov/pubmed/27387560 http://dx.doi.org/10.1186/s12859-016-1146-y |
_version_ | 1782441537198620672 |
---|---|
author | Vanhoutreve, Renaud Kress, Arnaud Legrand, Baptiste Gass, Hélène Poch, Olivier Thompson, Julie D. |
author_facet | Vanhoutreve, Renaud Kress, Arnaud Legrand, Baptiste Gass, Hélène Poch, Olivier Thompson, Julie D. |
author_sort | Vanhoutreve, Renaud |
collection | PubMed |
description | BACKGROUND: A standard procedure in many areas of bioinformatics is to use a multiple sequence alignment (MSA) as the basis for various types of homology-based inference. Applications include 3D structure modelling, protein functional annotation, prediction of molecular interactions, etc. These applications, however sophisticated, are generally highly sensitive to the alignment used, and neglecting non-homologous or uncertain regions in the alignment can lead to significant bias in the subsequent inferences. RESULTS: Here, we present a new method, LEON-BIS, which uses a robust Bayesian framework to estimate the homologous relations between sequences in a protein multiple alignment. Sequences are clustered into sub-families and relations are predicted at different levels, including ‘core blocks’, ‘regions’ and full-length proteins. The accuracy and reliability of the predictions are demonstrated in large-scale comparisons using well annotated alignment databases, where the homologous sequence segments are detected with very high sensitivity and specificity. CONCLUSIONS: LEON-BIS uses robust Bayesian statistics to distinguish the portions of multiple sequence alignments that are conserved either across the whole family or within subfamilies. LEON-BIS should thus be useful for automatic, high-throughput genome annotations, 2D/3D structure predictions, protein-protein interaction predictions etc. |
format | Online Article Text |
id | pubmed-4936259 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-49362592016-07-07 LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system Vanhoutreve, Renaud Kress, Arnaud Legrand, Baptiste Gass, Hélène Poch, Olivier Thompson, Julie D. BMC Bioinformatics Methodology Article BACKGROUND: A standard procedure in many areas of bioinformatics is to use a multiple sequence alignment (MSA) as the basis for various types of homology-based inference. Applications include 3D structure modelling, protein functional annotation, prediction of molecular interactions, etc. These applications, however sophisticated, are generally highly sensitive to the alignment used, and neglecting non-homologous or uncertain regions in the alignment can lead to significant bias in the subsequent inferences. RESULTS: Here, we present a new method, LEON-BIS, which uses a robust Bayesian framework to estimate the homologous relations between sequences in a protein multiple alignment. Sequences are clustered into sub-families and relations are predicted at different levels, including ‘core blocks’, ‘regions’ and full-length proteins. The accuracy and reliability of the predictions are demonstrated in large-scale comparisons using well annotated alignment databases, where the homologous sequence segments are detected with very high sensitivity and specificity. CONCLUSIONS: LEON-BIS uses robust Bayesian statistics to distinguish the portions of multiple sequence alignments that are conserved either across the whole family or within subfamilies. LEON-BIS should thus be useful for automatic, high-throughput genome annotations, 2D/3D structure predictions, protein-protein interaction predictions etc. BioMed Central 2016-07-07 /pmc/articles/PMC4936259/ /pubmed/27387560 http://dx.doi.org/10.1186/s12859-016-1146-y Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Vanhoutreve, Renaud Kress, Arnaud Legrand, Baptiste Gass, Hélène Poch, Olivier Thompson, Julie D. LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system |
title | LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system |
title_full | LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system |
title_fullStr | LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system |
title_full_unstemmed | LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system |
title_short | LEON-BIS: multiple alignment evaluation of sequence neighbours using a Bayesian inference system |
title_sort | leon-bis: multiple alignment evaluation of sequence neighbours using a bayesian inference system |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4936259/ https://www.ncbi.nlm.nih.gov/pubmed/27387560 http://dx.doi.org/10.1186/s12859-016-1146-y |
work_keys_str_mv | AT vanhoutreverenaud leonbismultiplealignmentevaluationofsequenceneighboursusingabayesianinferencesystem AT kressarnaud leonbismultiplealignmentevaluationofsequenceneighboursusingabayesianinferencesystem AT legrandbaptiste leonbismultiplealignmentevaluationofsequenceneighboursusingabayesianinferencesystem AT gasshelene leonbismultiplealignmentevaluationofsequenceneighboursusingabayesianinferencesystem AT pocholivier leonbismultiplealignmentevaluationofsequenceneighboursusingabayesianinferencesystem AT thompsonjulied leonbismultiplealignmentevaluationofsequenceneighboursusingabayesianinferencesystem |