
Identification of highly related references about gene-disease association

BACKGROUND: Curation of gene-disease associations published in literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS: We presen...


Bibliographic Details
Main authors: Liu, Rey-Long; Shih, Chia-Chun
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4162969/
https://www.ncbi.nlm.nih.gov/pubmed/25155502
http://dx.doi.org/10.1186/1471-2105-15-286
_version_ 1782334736000090112
author Liu, Rey-Long
Shih, Chia-Chun
author_facet Liu, Rey-Long
Shih, Chia-Chun
author_sort Liu, Rey-Long
collection PubMed
description BACKGROUND: Curation of gene-disease associations published in literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS: We present a technique CRFref (Conclusive, Rich, and Focused References) that, given a gene-disease pair <g, d>, ranks high those biomedical references that are likely to provide conclusive, rich, and focused results about g and d. Such references are expected to be highly related to the association between g and d. CRFref ranks candidate references based on their scores. To estimate the score of a reference r, CRFref estimates and integrates three measures: degree of conclusiveness, degree of richness, and degree of focus of r with respect to <g, d>. To evaluate CRFref, experiments are conducted on over one hundred thousand references for over one thousand gene-disease pairs. Experimental results show that CRFref performs significantly better than several typical types of baselines in ranking high those references that expert curators select to develop the summaries for specific gene-disease associations. CONCLUSION: CRFref is a good technique to rank high those references that are highly related to specific gene-disease associations. It can be incorporated into existing search engines to prioritize biomedical references for curators and researchers, as well as those text mining systems that aim at the study of gene-disease associations.
format Online
Article
Text
id pubmed-4162969
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4162969 2014-09-14 Identification of highly related references about gene-disease association Liu, Rey-Long Shih, Chia-Chun BMC Bioinformatics Research Article BACKGROUND: Curation of gene-disease associations published in literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS: We present a technique CRFref (Conclusive, Rich, and Focused References) that, given a gene-disease pair <g, d>, ranks high those biomedical references that are likely to provide conclusive, rich, and focused results about g and d. Such references are expected to be highly related to the association between g and d. CRFref ranks candidate references based on their scores. To estimate the score of a reference r, CRFref estimates and integrates three measures: degree of conclusiveness, degree of richness, and degree of focus of r with respect to <g, d>. To evaluate CRFref, experiments are conducted on over one hundred thousand references for over one thousand gene-disease pairs. Experimental results show that CRFref performs significantly better than several typical types of baselines in ranking high those references that expert curators select to develop the summaries for specific gene-disease associations. CONCLUSION: CRFref is a good technique to rank high those references that are highly related to specific gene-disease associations. It can be incorporated into existing search engines to prioritize biomedical references for curators and researchers, as well as those text mining systems that aim at the study of gene-disease associations. BioMed Central 2014-08-25 /pmc/articles/PMC4162969/ /pubmed/25155502 http://dx.doi.org/10.1186/1471-2105-15-286 Text en © Liu and Shih; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Liu, Rey-Long
Shih, Chia-Chun
Identification of highly related references about gene-disease association
title Identification of highly related references about gene-disease association
title_sort identification of highly related references about gene-disease association
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4162969/
https://www.ncbi.nlm.nih.gov/pubmed/25155502
http://dx.doi.org/10.1186/1471-2105-15-286
work_keys_str_mv AT liureylong identificationofhighlyrelatedreferencesaboutgenediseaseassociation
AT shihchiachun identificationofhighlyrelatedreferencesaboutgenediseaseassociation