
miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases

BACKGROUND: MicroRNAs are increasingly being appreciated as critical players in human diseases, and questions concerning the role of microRNAs arise in many areas of biomedical research. There are several manually curated databases of microRNA-disease associations gathered from the biomedical litera...


Bibliographic Details
Main authors: Gupta, Samir; Ross, Karen E.; Tudor, Catalina O.; Wu, Cathy H.; Schmidt, Carl J.; Vijay-Shanker, K.
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877743/
https://www.ncbi.nlm.nih.gov/pubmed/27216254
http://dx.doi.org/10.1186/s13326-015-0044-y
_version_ 1782433436019982336
author Gupta, Samir
Ross, Karen E.
Tudor, Catalina O.
Wu, Cathy H.
Schmidt, Carl J.
Vijay-Shanker, K.
author_facet Gupta, Samir
Ross, Karen E.
Tudor, Catalina O.
Wu, Cathy H.
Schmidt, Carl J.
Vijay-Shanker, K.
author_sort Gupta, Samir
collection PubMed
description BACKGROUND: MicroRNAs are increasingly being appreciated as critical players in human diseases, and questions concerning the role of microRNAs arise in many areas of biomedical research. There are several manually curated databases of microRNA-disease associations gathered from the biomedical literature; however, it is difficult for curators of these databases to keep up with the explosion of publications in the microRNA-disease field. Moreover, automated literature mining tools that assist manual curation of microRNA-disease associations currently capture only one microRNA property (expression) in the context of one disease (cancer). Thus, there is a clear need to develop more sophisticated automated literature mining tools that capture a variety of microRNA properties and relations in the context of multiple diseases to provide researchers with fast access to the most recent published information and to streamline and accelerate manual curation. METHODS: We have developed miRiaD (microRNAs in association with Disease), a text-mining tool that automatically extracts associations between microRNAs and diseases from the literature. These associations are often not directly linked, and the intermediate relations are often highly informative for the biomedical researcher. Thus, miRiaD extracts the miR-disease pairs together with an explanation for their association. We also developed a procedure that assigns scores to sentences, marking their informativeness, based on the microRNA-disease relation observed within the sentence. RESULTS: miRiaD was applied to the entire Medline corpus, identifying 8301 PMIDs with miR-disease associations. These abstracts and the miR-disease associations are available for browsing at http://biotm.cis.udel.edu/miRiaD. 
We evaluated the recall and precision of miRiaD with respect to information of high interest to public microRNA-disease database curators (expression and target gene associations), obtaining a recall of 88.46–90.78. When we expanded the evaluation to include sentences with a wide range of microRNA-disease information that may be of interest to biomedical researchers, miRiaD also performed very well with an F-score of 89.4. The informativeness ranking of sentences was evaluated in terms of nDCG (0.977) and correlation metrics (0.678–0.727) when compared to an annotator’s ranked list. CONCLUSIONS: miRiaD, a high-performance system that can capture a wide variety of microRNA-disease related information, extends beyond the scope of existing microRNA-disease resources. It can be incorporated into manual curation pipelines and serve as a resource for biomedical researchers interested in the role of microRNAs in disease. In our ongoing work we are developing an improved miRiaD web interface that will facilitate complex queries about microRNA-disease relationships, such as “In what diseases does microRNA regulation of apoptosis play a role?” or “Is there overlap in the sets of genes targeted by microRNAs in different types of dementia?” ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13326-015-0044-y) contains supplementary material, which is available to authorized users.
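The description above reports an F-score and an nDCG value but does not spell out how either was computed. The following is a minimal sketch of the textbook definitions only, assuming a log2 position discount for nDCG and the balanced (F1) form of the F-score; the function names and the sample relevance grades are hypothetical and are not taken from the miRiaD paper.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each grade is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(system_order):
    """nDCG of a ranking, given graded relevance listed in the system's order.

    The ideal ranking is the same grades sorted from highest to lowest.
    """
    ideal = dcg(sorted(system_order, reverse=True))
    return dcg(system_order) / ideal if ideal > 0 else 0.0


def f_score(precision, recall):
    """Balanced F-score (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical annotator grades for five sentences, in system-ranked order.
    grades_in_system_order = [3, 2, 3, 0, 1]
    print(f"nDCG = {ndcg(grades_in_system_order):.3f}")
    print(f"F1   = {f_score(0.90, 0.88):.3f}")
```

A ranking that already places the highest-graded sentences first yields an nDCG of 1.0, and any other ordering of the same grades scores lower, which is why the metric suits a comparison against an annotator's ranked list.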
format Online
Article
Text
id pubmed-4877743
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4877743 2016-05-25 miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases Gupta, Samir; Ross, Karen E.; Tudor, Catalina O.; Wu, Cathy H.; Schmidt, Carl J.; Vijay-Shanker, K. J Biomed Semantics Research BioMed Central 2016-04-29 /pmc/articles/PMC4877743/ /pubmed/27216254 http://dx.doi.org/10.1186/s13326-015-0044-y Text en © Gupta et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Gupta, Samir
Ross, Karen E.
Tudor, Catalina O.
Wu, Cathy H.
Schmidt, Carl J.
Vijay-Shanker, K.
miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases
title miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases
title_full miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases
title_fullStr miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases
title_full_unstemmed miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases
title_short miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases
title_sort miriad: a text mining tool for detecting associations of micrornas with diseases
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4877743/
https://www.ncbi.nlm.nih.gov/pubmed/27216254
http://dx.doi.org/10.1186/s13326-015-0044-y
work_keys_str_mv AT guptasamir miriadatextminingtoolfordetectingassociationsofmicrornaswithdiseases
AT rosskarene miriadatextminingtoolfordetectingassociationsofmicrornaswithdiseases
AT tudorcatalinao miriadatextminingtoolfordetectingassociationsofmicrornaswithdiseases
AT wucathyh miriadatextminingtoolfordetectingassociationsofmicrornaswithdiseases
AT schmidtcarlj miriadatextminingtoolfordetectingassociationsofmicrornaswithdiseases
AT vijayshankerk miriadatextminingtoolfordetectingassociationsofmicrornaswithdiseases