Cargando…

Identification of highly related references about gene-disease association

BACKGROUND: Curation of gene-disease associations published in literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS: We presen...

Descripción completa

Detalles Bibliográficos
Autores principales: Liu, Rey-Long, Shih, Chia-Chun
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2014
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4162969/
https://www.ncbi.nlm.nih.gov/pubmed/25155502
http://dx.doi.org/10.1186/1471-2105-15-286
Descripción
Sumario:BACKGROUND: Curation of gene-disease associations published in literature should be based on careful and frequent survey of the references that are highly related to specific gene-disease associations. Retrieval of the references is thus essential for timely and complete curation. RESULTS: We present a technique CRFref (Conclusive, Rich, and Focused References) that, given a gene-disease pair < g, d>, ranks high those biomedical references that are likely to provide conclusive, rich, and focused results about g and d. Such references are expected to be highly related to the association between g and d. CRFref ranks candidate references based on their scores. To estimate the score of a reference r, CRFref estimates and integrates three measures: degree of conclusiveness, degree of richness, and degree of focus of r with respect to < g, d>. To evaluate CRFref, experiments are conducted on over one hundred thousand references for over one thousand gene-disease pairs. Experimental results show that CRFref performs significantly better than several typical types of baselines in ranking high those references that expert curators select to develop the summaries for specific gene-disease associations. CONCLUSION: CRFref is a good technique to rank high those references that are highly related to specific gene-disease associations. It can be incorporated into existing search engines to prioritize biomedical references for curators and researchers, as well as those text mining systems that aim at the study of gene-disease associations.