A new statistical framework to assess structural alignment quality using information compression
Motivation: Progress in protein biology depends on the reliability of results from a handful of computational techniques, structural alignments being one. Recent reviews have highlighted substantial inconsistencies and differences between alignment results generated by the ever-growing stock of stru...
Main authors: | Collier, James H.; Allison, Lloyd; Lesk, Arthur M.; Garcia de la Banda, Maria; Konagurthu, Arun S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147913/ https://www.ncbi.nlm.nih.gov/pubmed/25161241 http://dx.doi.org/10.1093/bioinformatics/btu460 |
_version_ | 1782332536359223296 |
---|---|
author | Collier, James H.; Allison, Lloyd; Lesk, Arthur M.; Garcia de la Banda, Maria; Konagurthu, Arun S. |
author_facet | Collier, James H.; Allison, Lloyd; Lesk, Arthur M.; Garcia de la Banda, Maria; Konagurthu, Arun S. |
author_sort | Collier, James H. |
collection | PubMed |
description | Motivation: Progress in protein biology depends on the reliability of results from a handful of computational techniques, structural alignments being one. Recent reviews have highlighted substantial inconsistencies and differences between alignment results generated by the ever-growing stock of structural alignment programs. The lack of consensus on how the quality of structural alignments must be assessed has been identified as the main cause for the observed differences. Current methods assess structural alignment quality by constructing a scoring function that attempts to balance conflicting criteria, mainly alignment coverage and fidelity of structures under superposition. This traditional approach to measuring alignment quality, the subject of considerable literature, has failed to solve the problem. Further development along the same lines is unlikely to rectify the current deficiencies in the field. Results: This paper proposes a new statistical framework to assess structural alignment quality and significance based on lossless information compression. This is a radical departure from the traditional approach of formulating scoring functions. It links the structural alignment problem to the general class of statistical inductive inference problems, solved using the information-theoretic criterion of minimum message length. Based on this, we developed an efficient and reliable measure of structural alignment quality, I-value. The performance of I-value is demonstrated in comparison with a number of popular scoring functions, on a large collection of competing alignments. Our analysis shows that I-value provides a rigorous and reliable quantification of structural alignment quality, addressing a major gap in the field. Availability: http://lcb.infotech.monash.edu.au/I-value Contact: arun.konagurthu@monash.edu Supplementary information: Online supplementary data are available at http://lcb.infotech.monash.edu.au/I-value/suppl.html |
format | Online Article Text |
id | pubmed-4147913 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4147913 2014-09-02 A new statistical framework to assess structural alignment quality using information compression Collier, James H. Allison, Lloyd Lesk, Arthur M. Garcia de la Banda, Maria Konagurthu, Arun S. Bioinformatics Eccb 2014 Proceedings Papers Committee Motivation: Progress in protein biology depends on the reliability of results from a handful of computational techniques, structural alignments being one. Recent reviews have highlighted substantial inconsistencies and differences between alignment results generated by the ever-growing stock of structural alignment programs. The lack of consensus on how the quality of structural alignments must be assessed has been identified as the main cause for the observed differences. Current methods assess structural alignment quality by constructing a scoring function that attempts to balance conflicting criteria, mainly alignment coverage and fidelity of structures under superposition. This traditional approach to measuring alignment quality, the subject of considerable literature, has failed to solve the problem. Further development along the same lines is unlikely to rectify the current deficiencies in the field. Results: This paper proposes a new statistical framework to assess structural alignment quality and significance based on lossless information compression. This is a radical departure from the traditional approach of formulating scoring functions. It links the structural alignment problem to the general class of statistical inductive inference problems, solved using the information-theoretic criterion of minimum message length. Based on this, we developed an efficient and reliable measure of structural alignment quality, I-value. The performance of I-value is demonstrated in comparison with a number of popular scoring functions, on a large collection of competing alignments. Our analysis shows that I-value provides a rigorous and reliable quantification of structural alignment quality, addressing a major gap in the field. Availability: http://lcb.infotech.monash.edu.au/I-value Contact: arun.konagurthu@monash.edu Supplementary information: Online supplementary data are available at http://lcb.infotech.monash.edu.au/I-value/suppl.html Oxford University Press 2014-09-01 2014-08-22 /pmc/articles/PMC4147913/ /pubmed/25161241 http://dx.doi.org/10.1093/bioinformatics/btu460 Text en © The Author 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Eccb 2014 Proceedings Papers Committee Collier, James H. Allison, Lloyd Lesk, Arthur M. Garcia de la Banda, Maria Konagurthu, Arun S. A new statistical framework to assess structural alignment quality using information compression |
title | A new statistical framework to assess structural alignment quality using information compression |
title_full | A new statistical framework to assess structural alignment quality using information compression |
title_fullStr | A new statistical framework to assess structural alignment quality using information compression |
title_full_unstemmed | A new statistical framework to assess structural alignment quality using information compression |
title_short | A new statistical framework to assess structural alignment quality using information compression |
title_sort | new statistical framework to assess structural alignment quality using information compression |
topic | Eccb 2014 Proceedings Papers Committee |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147913/ https://www.ncbi.nlm.nih.gov/pubmed/25161241 http://dx.doi.org/10.1093/bioinformatics/btu460 |