
Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy

BACKGROUND: Noncoding RNAs (ncRNAs) play important roles in many biological processes. Existing genome-scale ncRNA search tools identify ncRNAs in local sequence alignments generated by conventional sequence comparison methods. However, some types of ncRNA lack strong sequence conservation and tend...


Bibliographic Details
Main authors: Sun, Yanni, Aljawad, Osama, Lei, Jikai, Liu, Alex
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3311100/
https://www.ncbi.nlm.nih.gov/pubmed/22536896
http://dx.doi.org/10.1186/1471-2105-13-S3-S12
_version_ 1782227746651373568
author Sun, Yanni
Aljawad, Osama
Lei, Jikai
Liu, Alex
author_facet Sun, Yanni
Aljawad, Osama
Lei, Jikai
Liu, Alex
author_sort Sun, Yanni
collection PubMed
description BACKGROUND: Noncoding RNAs (ncRNAs) play important roles in many biological processes. Existing genome-scale ncRNA search tools identify ncRNAs in local sequence alignments generated by conventional sequence comparison methods. However, some types of ncRNA lack strong sequence conservation and tend to be missed or misaligned by conventional sequence comparison. RESULTS: In this paper, we propose an ncRNA identification framework that is complementary to existing sequence comparison tools. By integrating a filtration step based on Hamming distance with ncRNA alignment programs such as FOLDALIGN or PLAST-ncRNA, the proposed ncRNA search framework can identify ncRNAs that lack strong sequence conservation. In addition, because the ratio of transitions to transversions is often used as a discriminative feature for functional ncRNA identification, we incorporate this feature into the filtration step using a coding strategy. We apply Hamming distance seeds to ncRNA search in the intergenic regions of the human and mouse genomes and between the Burkholderia cenocepacia J2315 genome and the Ralstonia solanacearum genome. The experimental results demonstrate that a carefully designed Hamming distance seed can achieve better sensitivity in searching for poorly conserved ncRNAs than conventional sequence comparison tools. CONCLUSIONS: Hamming distance seeds provide better sensitivity as a filtration strategy for genome-wide ncRNA homology search than the existing seeding strategies used in BLAST-like tools. By combining Hamming distance seed matching and ncRNA alignment, we are able to find ncRNAs with sequence similarities below 60%.
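The filtration idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the brute-force window scan, and the parameters `length` and `max_dist` are assumptions made for illustration, and the record does not describe the paper's actual seed design or coding strategy. The sketch only shows what it means to accept a pair of windows when their Hamming distance is small, and how transitions and transversions can be counted for such a pair.

```python
from typing import Iterator, Tuple

# Purines and pyrimidines; a transition is a substitution within one class,
# a transversion is a substitution between the two classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def count_substitutions(a: str, b: str) -> Tuple[int, int]:
    """Return (transitions, transversions) for two equal-length sequences.

    Any mismatch that is not within one nucleotide class (including
    ambiguity codes such as N) is counted as a transversion here.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


def seed_hits(
    query: str, target: str, length: int, max_dist: int
) -> Iterator[Tuple[int, int, int]]:
    """Yield (query_pos, target_pos, distance) for window pairs that pass.

    Hypothetical brute-force filter: every window of `length` in the query
    is compared with every window in the target, and the pair is kept when
    its Hamming distance is at most `max_dist`.
    """
    for i in range(len(query) - length + 1):
        q = query[i:i + length]
        for j in range(len(target) - length + 1):
            d = hamming_distance(q, target[j:j + length])
            if d <= max_dist:
                yield i, j, d
```

The quadratic scan is only practical for short sequences; a genome-scale search would need an index or hashing scheme, which this sketch does not attempt. Window pairs that pass the filter would then be handed to an alignment program such as those named in the abstract.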
format Online
Article
Text
id pubmed-3311100
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3311100 2012-04-02 Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy Sun, Yanni Aljawad, Osama Lei, Jikai Liu, Alex BMC Bioinformatics Proceedings BACKGROUND: Noncoding RNAs (ncRNAs) play important roles in many biological processes. Existing genome-scale ncRNA search tools identify ncRNAs in local sequence alignments generated by conventional sequence comparison methods. However, some types of ncRNA lack strong sequence conservation and tend to be missed or misaligned by conventional sequence comparison. RESULTS: In this paper, we propose an ncRNA identification framework that is complementary to existing sequence comparison tools. By integrating a filtration step based on Hamming distance with ncRNA alignment programs such as FOLDALIGN or PLAST-ncRNA, the proposed ncRNA search framework can identify ncRNAs that lack strong sequence conservation. In addition, because the ratio of transitions to transversions is often used as a discriminative feature for functional ncRNA identification, we incorporate this feature into the filtration step using a coding strategy. We apply Hamming distance seeds to ncRNA search in the intergenic regions of the human and mouse genomes and between the Burkholderia cenocepacia J2315 genome and the Ralstonia solanacearum genome. The experimental results demonstrate that a carefully designed Hamming distance seed can achieve better sensitivity in searching for poorly conserved ncRNAs than conventional sequence comparison tools. CONCLUSIONS: Hamming distance seeds provide better sensitivity as a filtration strategy for genome-wide ncRNA homology search than the existing seeding strategies used in BLAST-like tools. By combining Hamming distance seed matching and ncRNA alignment, we are able to find ncRNAs with sequence similarities below 60%.
BioMed Central 2012-03-21 /pmc/articles/PMC3311100/ /pubmed/22536896 http://dx.doi.org/10.1186/1471-2105-13-S3-S12 Text en Copyright ©2012 Sun et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Sun, Yanni
Aljawad, Osama
Lei, Jikai
Liu, Alex
Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy
title Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy
title_full Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy
title_fullStr Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy
title_full_unstemmed Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy
title_short Genome-scale NCRNA homology search using a Hamming distance-based filtration strategy
title_sort genome-scale ncrna homology search using a hamming distance-based filtration strategy
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3311100/
https://www.ncbi.nlm.nih.gov/pubmed/22536896
http://dx.doi.org/10.1186/1471-2105-13-S3-S12
work_keys_str_mv AT sunyanni genomescalencrnahomologysearchusingahammingdistancebasedfiltrationstrategy
AT aljawadosama genomescalencrnahomologysearchusingahammingdistancebasedfiltrationstrategy
AT leijikai genomescalencrnahomologysearchusingahammingdistancebasedfiltrationstrategy
AT liualex genomescalencrnahomologysearchusingahammingdistancebasedfiltrationstrategy