Prediction of RNA secondary structure by maximizing pseudo-expected accuracy
BACKGROUND: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure prediction; accordingly, new types of estimators, including the maximum expected accuracy (MEA) estimator, have been proposed. The MEA-based estimators hav...
Main authors: | Hamada, Michiaki; Sato, Kengo; Asai, Kiyoshi |
---|---|
Format: | Text |
Language: | English |
Published: |
BioMed Central
2010
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3003279/ https://www.ncbi.nlm.nih.gov/pubmed/21118522 http://dx.doi.org/10.1186/1471-2105-11-586 |
_version_ | 1782193851014840320 |
---|---|
author | Hamada, Michiaki Sato, Kengo Asai, Kiyoshi |
author_facet | Hamada, Michiaki Sato, Kengo Asai, Kiyoshi |
author_sort | Hamada, Michiaki |
collection | PubMed |
description | BACKGROUND: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure prediction; accordingly, new types of estimators, including the maximum expected accuracy (MEA) estimator, have been proposed. The MEA-based estimators have been designed to maximize the expected accuracy of the base-pairs and have achieved the highest level of accuracy. Those methods, however, do not give a single best prediction of the structure, but employ parameters to control the trade-off between the sensitivity and the positive predictive value (PPV). It is unclear what parameter value should be used, and even a well-trained default parameter value does not, in general, give the best result on popular accuracy measures for each RNA sequence. RESULTS: Instead of using the expected values of the popular accuracy measures for RNA secondary structure prediction, which are difficult to calculate, the pseudo-expected accuracy, which can easily be computed from base-pairing probabilities, is introduced. It is shown that the pseudo-expected accuracy is a good approximation in terms of sensitivity, PPV, MCC, or F-score. The pseudo-expected accuracy can be approximately maximized for each RNA sequence by stochastic sampling. It is also shown that secondary structures that are well balanced between sensitivity and PPV can be predicted with a small computational overhead by combining the pseudo-expected accuracy of MCC or F-score with the γ-centroid estimator. CONCLUSIONS: This study gives not only a method for predicting a secondary structure that balances sensitivity and PPV, but also a general method for approximately maximizing the (pseudo-)expected accuracy with respect to various evaluation measures including MCC and F-score. |
format | Text |
id | pubmed-3003279 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3003279 2011-01-06 Prediction of RNA secondary structure by maximizing pseudo-expected accuracy Hamada, Michiaki Sato, Kengo Asai, Kiyoshi BMC Bioinformatics Research Article BACKGROUND: Recent studies have revealed the importance of considering the entire distribution of possible secondary structures in RNA secondary structure prediction; accordingly, new types of estimators, including the maximum expected accuracy (MEA) estimator, have been proposed. The MEA-based estimators have been designed to maximize the expected accuracy of the base-pairs and have achieved the highest level of accuracy. Those methods, however, do not give a single best prediction of the structure, but employ parameters to control the trade-off between the sensitivity and the positive predictive value (PPV). It is unclear what parameter value should be used, and even a well-trained default parameter value does not, in general, give the best result on popular accuracy measures for each RNA sequence. RESULTS: Instead of using the expected values of the popular accuracy measures for RNA secondary structure prediction, which are difficult to calculate, the pseudo-expected accuracy, which can easily be computed from base-pairing probabilities, is introduced. It is shown that the pseudo-expected accuracy is a good approximation in terms of sensitivity, PPV, MCC, or F-score. The pseudo-expected accuracy can be approximately maximized for each RNA sequence by stochastic sampling. It is also shown that secondary structures that are well balanced between sensitivity and PPV can be predicted with a small computational overhead by combining the pseudo-expected accuracy of MCC or F-score with the γ-centroid estimator. CONCLUSIONS: This study gives not only a method for predicting a secondary structure that balances sensitivity and PPV, but also a general method for approximately maximizing the (pseudo-)expected accuracy with respect to various evaluation measures including MCC and F-score.
BioMed Central 2010-11-30 /pmc/articles/PMC3003279/ /pubmed/21118522 http://dx.doi.org/10.1186/1471-2105-11-586 Text en Copyright ©2010 Hamada et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Hamada, Michiaki Sato, Kengo Asai, Kiyoshi Prediction of RNA secondary structure by maximizing pseudo-expected accuracy |
title | Prediction of RNA secondary structure by maximizing pseudo-expected accuracy |
title_full | Prediction of RNA secondary structure by maximizing pseudo-expected accuracy |
title_fullStr | Prediction of RNA secondary structure by maximizing pseudo-expected accuracy |
title_full_unstemmed | Prediction of RNA secondary structure by maximizing pseudo-expected accuracy |
title_short | Prediction of RNA secondary structure by maximizing pseudo-expected accuracy |
title_sort | prediction of rna secondary structure by maximizing pseudo-expected accuracy |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3003279/ https://www.ncbi.nlm.nih.gov/pubmed/21118522 http://dx.doi.org/10.1186/1471-2105-11-586 |
work_keys_str_mv | AT hamadamichiaki predictionofrnasecondarystructurebymaximizingpseudoexpectedaccuracy AT satokengo predictionofrnasecondarystructurebymaximizingpseudoexpectedaccuracy AT asaikiyoshi predictionofrnasecondarystructurebymaximizingpseudoexpectedaccuracy |
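
The description above states that the pseudo-expected accuracy "can easily be computed from base-pairing probabilities". The following is a minimal sketch of one way to read that statement, not the authors' implementation. It assumes that pseudo-expected TP, FP, FN and TN counts are obtained by summing base-pairing probabilities (or their complements) over predicted and non-predicted pairs, and that these counts are then substituted into the usual F-score and MCC formulas. All function names, the dictionary representation of the probabilities, and the candidate-selection helper are hypothetical and chosen only for illustration.

```python
import math


def pseudo_expected_counts(pairs, bpp, n):
    """Return pseudo-expected (TP, FP, FN, TN) for one candidate structure.

    pairs: set of (i, j) base pairs with i < j (the candidate structure)
    bpp:   dict mapping (i, j) with i < j to a base-pairing probability;
           pairs missing from the dict are treated as probability 0
    n:     sequence length
    """
    tp = sum(bpp.get(p, 0.0) for p in pairs)   # sum of p_ij over predicted pairs
    fp = len(pairs) - tp                       # sum of (1 - p_ij) over predicted pairs
    fn = sum(bpp.values()) - tp                # sum of p_ij over non-predicted pairs
    tn = n * (n - 1) // 2 - tp - fp - fn       # remaining mass over all i < j
    return tp, fp, fn, tn


def pseudo_f_score(tp, fp, fn):
    """F-score formula applied to pseudo-expected counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def pseudo_mcc(tp, fp, fn, tn):
    """MCC formula applied to pseudo-expected counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0


def best_candidate(candidates, bpp, n):
    """Pick the candidate structure with the highest pseudo-expected F-score.

    The candidates could, for example, be predictions made with several
    trade-off parameter values (an assumption of this sketch).
    """
    def score(pairs):
        tp, fp, fn, _ = pseudo_expected_counts(pairs, bpp, n)
        return pseudo_f_score(tp, fp, fn)

    return max(candidates, key=score)


if __name__ == "__main__":
    # Toy probabilities for a length-6 sequence (made-up values).
    bpp = {(0, 5): 0.9, (1, 4): 0.8, (0, 4): 0.1}
    candidates = [{(0, 5)}, {(0, 5), (1, 4)}]
    chosen = best_candidate(candidates, bpp, 6)
    print(sorted(chosen))
```

In this sketch the selection step only scores a fixed list of candidates; the stochastic sampling mentioned in the description is not reproduced here.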