MicroRNA Target Detection and Analysis for Genes Related to Breast Cancer Using MDLcompress

We describe initial results of miRNA sequence analysis with the optimal symbol compression ratio (OSCR) algorithm and recast this grammar inference algorithm as an improved minimum description length (MDL) learning tool: MDLcompress. We apply this tool to explore the relationship between miRNAs, single nucleotide polymorphisms (SNPs), and breast cancer.
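
The record's full description refers to a two-part MDL code in which each phrase has a quantifiable cost in bits. As a rough illustration of that idea only, the Python sketch below compares a literal encoding of a DNA string with a two-part encoding (one stored phrase plus the sequence rewritten with a placeholder symbol). It is not the authors' MDLcompress or OSCR algorithm; the function names, the fixed-length symbol costs, and the single-phrase model are all assumptions made for this example.

```python
import math


def literal_cost_bits(seq, alphabet_size=4):
    """Bits needed to write seq symbol by symbol with a fixed-length code."""
    return len(seq) * math.log2(alphabet_size)


def two_part_cost_bits(seq, phrase, alphabet_size=4):
    """Toy two-part cost: model (the phrase) plus data (seq rewritten with it).

    Assumption: the phrase is stored once at log2(alphabet_size) bits per base,
    and every non-overlapping occurrence in seq collapses to one extra symbol,
    so the rewritten sequence uses an alphabet of alphabet_size + 1 symbols.
    """
    count = seq.count(phrase)  # str.count counts non-overlapping occurrences
    model_bits = len(phrase) * math.log2(alphabet_size)
    rewritten_len = len(seq) - count * len(phrase) + count
    data_bits = rewritten_len * math.log2(alphabet_size + 1)
    return model_bits + data_bits


def best_phrase(seq, min_len=2, max_len=8):
    """Return (phrase, cost) minimising the toy two-part cost.

    Returns (None, literal cost) when no candidate phrase beats the
    literal encoding.
    """
    best = (None, literal_cost_bits(seq))
    candidates = {
        seq[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(seq) - n + 1)
    }
    for phrase in sorted(candidates):  # sorted for a deterministic result
        cost = two_part_cost_bits(seq, phrase)
        if cost < best[1]:
            best = (phrase, cost)
    return best


if __name__ == "__main__":
    seq = "ACGTACGTACGTACGT"
    phrase, cost = best_phrase(seq)
    print(phrase, round(cost, 2), "bits vs", literal_cost_bits(seq), "bits literal")
```

In this toy setting, a phrase is worth adding to the model only when the bits it saves in the rewritten sequence exceed the bits spent storing it. The same accounting gives a loose intuition for why a change inside a heavily reused phrase (such as a SNP in a repeated motif) would alter the description length more than a change elsewhere.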


Bibliographic Details
Main Authors: Evans, Scott C; Kourtidis, Antonis; Markham, T Stephen; Miller, Jonathan; Conklin, Douglas S; Torres, Andrew S
Format: Online Article Text
Language: English
Published: Springer 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3171339/
https://www.ncbi.nlm.nih.gov/pubmed/18317504
http://dx.doi.org/10.1186/1687-4153-2007-43670
author Evans, Scott C
Kourtidis, Antonis
Markham, T Stephen
Miller, Jonathan
Conklin, Douglas S
Torres, Andrew S
collection PubMed
description We describe initial results of miRNA sequence analysis with the optimal symbol compression ratio (OSCR) algorithm and recast this grammar inference algorithm as an improved minimum description length (MDL) learning tool: MDLcompress. We apply this tool to explore the relationship between miRNAs, single nucleotide polymorphisms (SNPs), and breast cancer. Our new algorithm outperforms other grammar-based coding methods, such as DNA Sequitur, while retaining a two-part code that highlights biologically significant phrases. The deep recursion of MDLcompress, together with its explicit two-part coding, enables it to identify biologically meaningful sequence without needlessly restrictive priors. The ability to quantify cost in bits for phrases in the MDL model allows prediction of regions where SNPs may have the most impact on biological activity. MDLcompress improves on our previous algorithm in execution time through an innovative data structure, and in specificity of motif detection (compression) through improved heuristics. An MDLcompress analysis of 144 overexpressed genes from the breast cancer cell line BT474 has identified novel motifs, including potential microRNA (miRNA) binding sites that are candidates for experimental validation.
format Online
Article
Text
id pubmed-3171339
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Springer
record_format MEDLINE/PubMed
spelling pubmed-3171339 2011-09-13
MicroRNA Target Detection and Analysis for Genes Related to Breast Cancer Using MDLcompress
Evans, Scott C; Kourtidis, Antonis; Markham, T Stephen; Miller, Jonathan; Conklin, Douglas S; Torres, Andrew S
EURASIP J Bioinform Syst Biol
Research Article
We describe initial results of miRNA sequence analysis with the optimal symbol compression ratio (OSCR) algorithm and recast this grammar inference algorithm as an improved minimum description length (MDL) learning tool: MDLcompress. We apply this tool to explore the relationship between miRNAs, single nucleotide polymorphisms (SNPs), and breast cancer. Our new algorithm outperforms other grammar-based coding methods, such as DNA Sequitur, while retaining a two-part code that highlights biologically significant phrases. The deep recursion of MDLcompress, together with its explicit two-part coding, enables it to identify biologically meaningful sequence without needlessly restrictive priors. The ability to quantify cost in bits for phrases in the MDL model allows prediction of regions where SNPs may have the most impact on biological activity. MDLcompress improves on our previous algorithm in execution time through an innovative data structure, and in specificity of motif detection (compression) through improved heuristics. An MDLcompress analysis of 144 overexpressed genes from the breast cancer cell line BT474 has identified novel motifs, including potential microRNA (miRNA) binding sites that are candidates for experimental validation.
Springer 2007-10-30
/pmc/articles/PMC3171339/
/pubmed/18317504
http://dx.doi.org/10.1186/1687-4153-2007-43670
Text en
Copyright © 2007 General Electric Company.
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research Article