
Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators

Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task not only for molecular cell biology but also for bioinformatics. Secondary structures of ncRNAs are used as a key feature in ncRNA analysis because the biological functions of ncRNAs are closely related to their secondary struc...


Bibliographic Details
Main Authors: Okada, Yohei; Saito, Yutaka; Sato, Kengo; Sakakibara, Yasubumi
Format: Online Article Text
Language: English
Published: Frontiers Research Foundation 2011
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268607/
https://www.ncbi.nlm.nih.gov/pubmed/22303350
http://dx.doi.org/10.3389/fgene.2011.00054
_version_ 1782222389685256192
author Okada, Yohei
Saito, Yutaka
Sato, Kengo
Sakakibara, Yasubumi
author_facet Okada, Yohei
Saito, Yutaka
Sato, Kengo
Sakakibara, Yasubumi
author_sort Okada, Yohei
collection PubMed
description Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task not only for molecular cell biology but also for bioinformatics. Secondary structures of ncRNAs are used as a key feature in ncRNA analysis because the biological functions of ncRNAs are closely related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as its most stable structure, MFE alone is not an appropriate measure for identifying ncRNAs because the free energy is heavily biased by nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed, such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measurements are not suitable for identifying ncRNAs in some cases, including genome-wide searches, and they incur a high false discovery rate. In this study, we propose improved measurements based on SCI and BPD that apply generalized centroid estimators to gain robustness against low-quality multiple alignments. Our experiments show that the proposed methods achieve higher accuracy than the original SCI and BPD, not only on human-curated structural alignments but also on low-quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than, or comparable to, the original SCI on structural alignments generated with RAF, a high-quality structural aligner that requires twice the computational time on average. We conclude that our methods are more suitable than the original SCI and BPD for genome-wide alignments, which are of low quality from the point of view of secondary structure.
format Online
Article
Text
id pubmed-3268607
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Frontiers Research Foundation
record_format MEDLINE/PubMed
spelling pubmed-3268607 2012-02-02 Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators Okada, Yohei Saito, Yutaka Sato, Kengo Sakakibara, Yasubumi Front Genet Genetics Identification of non-protein-coding RNAs (ncRNAs) in genomes is a crucial task not only for molecular cell biology but also for bioinformatics. Secondary structures of ncRNAs are used as a key feature in ncRNA analysis because the biological functions of ncRNAs are closely related to their secondary structures. Although the minimum free energy (MFE) structure of an RNA sequence is regarded as its most stable structure, MFE alone is not an appropriate measure for identifying ncRNAs because the free energy is heavily biased by nucleotide composition. Therefore, instead of MFE itself, several alternative measures for identifying ncRNAs have been proposed, such as the structure conservation index (SCI) and the base pair distance (BPD), both of which employ MFE structures. However, these measurements are not suitable for identifying ncRNAs in some cases, including genome-wide searches, and they incur a high false discovery rate. In this study, we propose improved measurements based on SCI and BPD that apply generalized centroid estimators to gain robustness against low-quality multiple alignments. Our experiments show that the proposed methods achieve higher accuracy than the original SCI and BPD, not only on human-curated structural alignments but also on low-quality alignments produced by CLUSTAL W. Furthermore, the centroid-based SCI on CLUSTAL W alignments is more accurate than, or comparable to, the original SCI on structural alignments generated with RAF, a high-quality structural aligner that requires twice the computational time on average. We conclude that our methods are more suitable than the original SCI and BPD for genome-wide alignments, which are of low quality from the point of view of secondary structure.
Frontiers Research Foundation 2011-08-31 /pmc/articles/PMC3268607/ /pubmed/22303350 http://dx.doi.org/10.3389/fgene.2011.00054 Text en Copyright © 2011 Okada, Saito, Sato and Sakakibara. http://www.frontiersin.org/licenseagreement This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.
spellingShingle Genetics
Okada, Yohei
Saito, Yutaka
Sato, Kengo
Sakakibara, Yasubumi
Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators
title Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators
title_full Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators
title_fullStr Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators
title_full_unstemmed Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators
title_short Improved Measurements of RNA Structure Conservation with Generalized Centroid Estimators
title_sort improved measurements of rna structure conservation with generalized centroid estimators
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268607/
https://www.ncbi.nlm.nih.gov/pubmed/22303350
http://dx.doi.org/10.3389/fgene.2011.00054
work_keys_str_mv AT okadayohei improvedmeasurementsofrnastructureconservationwithgeneralizedcentroidestimators
AT saitoyutaka improvedmeasurementsofrnastructureconservationwithgeneralizedcentroidestimators
AT satokengo improvedmeasurementsofrnastructureconservationwithgeneralizedcentroidestimators
AT sakakibarayasubumi improvedmeasurementsofrnastructureconservationwithgeneralizedcentroidestimators