Identification of highly related references about gene-disease association
BACKGROUND: Curation of gene-disease associations published in literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS: We presen...
Main Authors: | Liu, Rey-Long; Shih, Chia-Chun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4162969/ https://www.ncbi.nlm.nih.gov/pubmed/25155502 http://dx.doi.org/10.1186/1471-2105-15-286 |
_version_ | 1782334736000090112 |
---|---|
author | Liu, Rey-Long Shih, Chia-Chun |
author_facet | Liu, Rey-Long Shih, Chia-Chun |
author_sort | Liu, Rey-Long |
collection | PubMed |
description | BACKGROUND: Curation of gene-disease associations published in literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS: We present a technique CRFref (Conclusive, Rich, and Focused References) that, given a gene-disease pair <g, d>, ranks high those biomedical references that are likely to provide conclusive, rich, and focused results about g and d. Such references are expected to be highly related to the association between g and d. CRFref ranks candidate references based on their scores. To estimate the score of a reference r, CRFref estimates and integrates three measures: degree of conclusiveness, degree of richness, and degree of focus of r with respect to <g, d>. To evaluate CRFref, experiments are conducted on over one hundred thousand references for over one thousand gene-disease pairs. Experimental results show that CRFref performs significantly better than several typical types of baselines in ranking high those references that expert curators select to develop the summaries for specific gene-disease associations. CONCLUSION: CRFref is a good technique to rank high those references that are highly related to specific gene-disease associations. It can be incorporated into existing search engines to prioritize biomedical references for curators and researchers, as well as those text mining systems that aim at the study of gene-disease associations. |
format | Online Article Text |
id | pubmed-4162969 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-41629692014-09-14 Identification of highly related references about gene-disease association Liu, Rey-Long Shih, Chia-Chun BMC Bioinformatics Research Article BACKGROUND: Curation of gene-disease associations published in literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS: We present a technique CRFref (Conclusive, Rich, and Focused References) that, given a gene-disease pair <g, d>, ranks high those biomedical references that are likely to provide conclusive, rich, and focused results about g and d. Such references are expected to be highly related to the association between g and d. CRFref ranks candidate references based on their scores. To estimate the score of a reference r, CRFref estimates and integrates three measures: degree of conclusiveness, degree of richness, and degree of focus of r with respect to <g, d>. To evaluate CRFref, experiments are conducted on over one hundred thousand references for over one thousand gene-disease pairs. Experimental results show that CRFref performs significantly better than several typical types of baselines in ranking high those references that expert curators select to develop the summaries for specific gene-disease associations. CONCLUSION: CRFref is a good technique to rank high those references that are highly related to specific gene-disease associations. It can be incorporated into existing search engines to prioritize biomedical references for curators and researchers, as well as those text mining systems that aim at the study of gene-disease associations. BioMed Central 2014-08-25 /pmc/articles/PMC4162969/ /pubmed/25155502 http://dx.doi.org/10.1186/1471-2105-15-286 Text en © Liu and Shih; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. 
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Liu, Rey-Long Shih, Chia-Chun Identification of highly related references about gene-disease association |
title | Identification of highly related references about gene-disease association |
title_full | Identification of highly related references about gene-disease association |
title_fullStr | Identification of highly related references about gene-disease association |
title_full_unstemmed | Identification of highly related references about gene-disease association |
title_short | Identification of highly related references about gene-disease association |
title_sort | identification of highly related references about gene-disease association |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4162969/ https://www.ncbi.nlm.nih.gov/pubmed/25155502 http://dx.doi.org/10.1186/1471-2105-15-286 |
work_keys_str_mv | AT liureylong identificationofhighlyrelatedreferencesaboutgenediseaseassociation AT shihchiachun identificationofhighlyrelatedreferencesaboutgenediseaseassociation |