PheneBank: a literature-based database of phenotypes
MOTIVATION: Significant effort has been spent by curators to create coding systems for phenotypes such as the Human Phenotype Ontology, as well as disease–phenotype annotations. We aim to support the discovery of literature-based phenotypes and integrate them into the knowledge discovery process. RE...
Main Authors: | Pilehvar, Mohammad Taher; Bernard, Adam; Smedley, Damian; Collier, Nigel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8796364/ https://www.ncbi.nlm.nih.gov/pubmed/34788791 http://dx.doi.org/10.1093/bioinformatics/btab740 |
_version_ | 1784641289021554688 |
---|---|
author | Pilehvar, Mohammad Taher Bernard, Adam Smedley, Damian Collier, Nigel |
author_facet | Pilehvar, Mohammad Taher Bernard, Adam Smedley, Damian Collier, Nigel |
author_sort | Pilehvar, Mohammad Taher |
collection | PubMed |
description | MOTIVATION: Significant effort has been spent by curators to create coding systems for phenotypes such as the Human Phenotype Ontology, as well as disease–phenotype annotations. We aim to support the discovery of literature-based phenotypes and integrate them into the knowledge discovery process. RESULTS: PheneBank is a Web-portal for retrieving human phenotype–disease associations that have been text-mined from the whole of Medline. Our approach exploits state-of-the-art machine learning for concept identification by utilizing an expert-annotated rare disease corpus from the PMC Text Mining subset. Evaluation of the system for entities is conducted on a gold-standard corpus of rare disease sentences and for associations against the Monarch initiative data. AVAILABILITY AND IMPLEMENTATION: The PheneBank Web-portal is freely available at http://www.phenebank.org. Annotated Medline data is available from Zenodo at DOI: 10.5281/zenodo.1408800. Semantic annotation software is freely available for non-commercial use at GitHub: https://github.com/pilehvar/phenebank. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8796364 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8796364 2022-01-31 PheneBank: a literature-based database of phenotypes Pilehvar, Mohammad Taher Bernard, Adam Smedley, Damian Collier, Nigel Bioinformatics Applications Notes MOTIVATION: Significant effort has been spent by curators to create coding systems for phenotypes such as the Human Phenotype Ontology, as well as disease–phenotype annotations. We aim to support the discovery of literature-based phenotypes and integrate them into the knowledge discovery process. RESULTS: PheneBank is a Web-portal for retrieving human phenotype–disease associations that have been text-mined from the whole of Medline. Our approach exploits state-of-the-art machine learning for concept identification by utilizing an expert-annotated rare disease corpus from the PMC Text Mining subset. Evaluation of the system for entities is conducted on a gold-standard corpus of rare disease sentences and for associations against the Monarch initiative data. AVAILABILITY AND IMPLEMENTATION: The PheneBank Web-portal is freely available at http://www.phenebank.org. Annotated Medline data is available from Zenodo at DOI: 10.5281/zenodo.1408800. Semantic annotation software is freely available for non-commercial use at GitHub: https://github.com/pilehvar/phenebank. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-11-12 /pmc/articles/PMC8796364/ /pubmed/34788791 http://dx.doi.org/10.1093/bioinformatics/btab740 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Applications Notes Pilehvar, Mohammad Taher Bernard, Adam Smedley, Damian Collier, Nigel PheneBank: a literature-based database of phenotypes |
title | PheneBank: a literature-based database of phenotypes |
title_full | PheneBank: a literature-based database of phenotypes |
title_fullStr | PheneBank: a literature-based database of phenotypes |
title_full_unstemmed | PheneBank: a literature-based database of phenotypes |
title_short | PheneBank: a literature-based database of phenotypes |
title_sort | phenebank: a literature-based database of phenotypes |
topic | Applications Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8796364/ https://www.ncbi.nlm.nih.gov/pubmed/34788791 http://dx.doi.org/10.1093/bioinformatics/btab740 |
work_keys_str_mv | AT pilehvarmohammadtaher phenebankaliteraturebaseddatabaseofphenotypes AT bernardadam phenebankaliteraturebaseddatabaseofphenotypes AT smedleydamian phenebankaliteraturebaseddatabaseofphenotypes AT colliernigel phenebankaliteraturebaseddatabaseofphenotypes |