Prediction of RNA secondary structure by maximizing pseudo-expected accuracy

BACKGROUND: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure prediction; accordingly, new types of estimators, including the maximum expected accuracy (MEA) estimator, have been proposed. The MEA-based estimators hav...

Bibliographic Details
Main Authors: Hamada, Michiaki; Sato, Kengo; Asai, Kiyoshi
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3003279/
https://www.ncbi.nlm.nih.gov/pubmed/21118522
http://dx.doi.org/10.1186/1471-2105-11-586
_version_ 1782193851014840320
author Hamada, Michiaki
Sato, Kengo
Asai, Kiyoshi
author_facet Hamada, Michiaki
Sato, Kengo
Asai, Kiyoshi
author_sort Hamada, Michiaki
collection PubMed
description BACKGROUND: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure prediction; accordingly, new types of estimators, including the maximum expected accuracy (MEA) estimator, have been proposed. The MEA-based estimators have been designed to maximize the expected accuracy of the base-pairs and have achieved the highest level of accuracy. These methods, however, do not give a single best prediction of the structure, but employ parameters to control the trade-off between sensitivity and positive predictive value (PPV). It is unclear what parameter value should be used, and even a well-trained default parameter value does not, in general, give the best result under popular accuracy measures for each RNA sequence. RESULTS: Instead of using the expected values of the popular accuracy measures for RNA secondary structure prediction, which are difficult to calculate, the pseudo-expected accuracy, which can easily be computed from base-pairing probabilities, is introduced. It is shown that the pseudo-expected accuracy is a good approximation in terms of sensitivity, PPV, MCC, or F-score. The pseudo-expected accuracy can be approximately maximized for each RNA sequence by stochastic sampling. It is also shown that secondary structures that are well balanced between sensitivity and PPV can be predicted with a small computational overhead by combining the pseudo-expected accuracy of MCC or F-score with the γ-centroid estimator. CONCLUSIONS: This study gives not only a method for predicting a secondary structure that balances sensitivity and PPV, but also a general method for approximately maximizing the (pseudo-)expected accuracy with respect to various evaluation measures, including MCC and F-score.
format Text
id pubmed-3003279
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3003279 2011-01-06 Prediction of RNA secondary structure by maximizing pseudo-expected accuracy Hamada, Michiaki Sato, Kengo Asai, Kiyoshi BMC Bioinformatics Research Article
BioMed Central 2010-11-30 /pmc/articles/PMC3003279/ /pubmed/21118522 http://dx.doi.org/10.1186/1471-2105-11-586 Text en Copyright ©2010 Hamada et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Hamada, Michiaki
Sato, Kengo
Asai, Kiyoshi
Prediction of RNA secondary structure by maximizing pseudo-expected accuracy
title Prediction of RNA secondary structure by maximizing pseudo-expected accuracy
title_full Prediction of RNA secondary structure by maximizing pseudo-expected accuracy
title_fullStr Prediction of RNA secondary structure by maximizing pseudo-expected accuracy
title_full_unstemmed Prediction of RNA secondary structure by maximizing pseudo-expected accuracy
title_short Prediction of RNA secondary structure by maximizing pseudo-expected accuracy
title_sort prediction of rna secondary structure by maximizing pseudo-expected accuracy
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3003279/
https://www.ncbi.nlm.nih.gov/pubmed/21118522
http://dx.doi.org/10.1186/1471-2105-11-586