
PhenoMiner: from text to a database of phenotypes associated with OMIM diseases

Analysis of scientific and clinical phenotypes reported in the experimental literature has been curated manually to build high-quality databases such as the Online Mendelian Inheritance in Man (OMIM). However, the identification and harmonization of phenotype descriptions struggles with the diversit...


Bibliographic Details
Main Authors: Collier, Nigel; Groza, Tudor; Smedley, Damian; Robinson, Peter N.; Oellrich, Anika; Rebholz-Schuhmann, Dietrich
Format: Online Article Text
Language: English
Published: Oxford University Press 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4622021/
https://www.ncbi.nlm.nih.gov/pubmed/26507285
http://dx.doi.org/10.1093/database/bav104
_version_ 1782397530672201728
author Collier, Nigel
Groza, Tudor
Smedley, Damian
Robinson, Peter N.
Oellrich, Anika
Rebholz-Schuhmann, Dietrich
author_facet Collier, Nigel
Groza, Tudor
Smedley, Damian
Robinson, Peter N.
Oellrich, Anika
Rebholz-Schuhmann, Dietrich
author_sort Collier, Nigel
collection PubMed
description Analysis of scientific and clinical phenotypes reported in the experimental literature has been curated manually to build high-quality databases such as the Online Mendelian Inheritance in Man (OMIM). However, the identification and harmonization of phenotype descriptions struggles with the diversity of human expressivity. We introduce a novel automated extraction approach called PhenoMiner that exploits full parsing and conceptual analysis. Apriori association mining is then used to identify relationships to human diseases. We applied PhenoMiner to the BMC open access collection and identified 13 636 phenotype candidates. We identified 28 155 phenotype-disorder hypotheses covering 4898 phenotypes and 1659 Mendelian disorders. Analysis showed: (i) the semantic distribution of the extracted terms against linked ontologies; (ii) a comparison of term overlap with the Human Phenotype Ontology (HP); (iii) moderate support for phenotype-disorder pairs in both OMIM and the literature; (iv) strong associations of phenotype-disorder pairs to known disease-genes pairs using PhenoDigm. The full list of PhenoMiner phenotypes (S1), phenotype-disorder associations (S2), association-filtered linked data (S3) and user database documentation (S5) is available as supplementary data and can be downloaded at http://github.com/nhcollier/PhenoMiner under a Creative Commons Attribution 4.0 license. Database URL: phenominer.mml.cam.ac.uk
format Online
Article
Text
id pubmed-4622021
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-46220212015-10-28 PhenoMiner: from text to a database of phenotypes associated with OMIM diseases Collier, Nigel Groza, Tudor Smedley, Damian Robinson, Peter N. Oellrich, Anika Rebholz-Schuhmann, Dietrich Database (Oxford) Original Article Analysis of scientific and clinical phenotypes reported in the experimental literature has been curated manually to build high-quality databases such as the Online Mendelian Inheritance in Man (OMIM). However, the identification and harmonization of phenotype descriptions struggles with the diversity of human expressivity. We introduce a novel automated extraction approach called PhenoMiner that exploits full parsing and conceptual analysis. Apriori association mining is then used to identify relationships to human diseases. We applied PhenoMiner to the BMC open access collection and identified 13 636 phenotype candidates. We identified 28 155 phenotype-disorder hypotheses covering 4898 phenotypes and 1659 Mendelian disorders. Analysis showed: (i) the semantic distribution of the extracted terms against linked ontologies; (ii) a comparison of term overlap with the Human Phenotype Ontology (HP); (iii) moderate support for phenotype-disorder pairs in both OMIM and the literature; (iv) strong associations of phenotype-disorder pairs to known disease-genes pairs using PhenoDigm. The full list of PhenoMiner phenotypes (S1), phenotype-disorder associations (S2), association-filtered linked data (S3) and user database documentation (S5) is available as supplementary data and can be downloaded at http://github.com/nhcollier/PhenoMiner under a Creative Commons Attribution 4.0 license. Database URL: phenominer.mml.cam.ac.uk Oxford University Press 2015-10-27 /pmc/articles/PMC4622021/ /pubmed/26507285 http://dx.doi.org/10.1093/database/bav104 Text en © The Author(s) 2015. Published by Oxford University Press. 
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Article
Collier, Nigel
Groza, Tudor
Smedley, Damian
Robinson, Peter N.
Oellrich, Anika
Rebholz-Schuhmann, Dietrich
PhenoMiner: from text to a database of phenotypes associated with OMIM diseases
title PhenoMiner: from text to a database of phenotypes associated with OMIM diseases
title_full PhenoMiner: from text to a database of phenotypes associated with OMIM diseases
title_fullStr PhenoMiner: from text to a database of phenotypes associated with OMIM diseases
title_full_unstemmed PhenoMiner: from text to a database of phenotypes associated with OMIM diseases
title_short PhenoMiner: from text to a database of phenotypes associated with OMIM diseases
title_sort phenominer: from text to a database of phenotypes associated with omim diseases
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4622021/
https://www.ncbi.nlm.nih.gov/pubmed/26507285
http://dx.doi.org/10.1093/database/bav104