Mining Skeletal Phenotype Descriptions from Scientific Literature

Phenotype descriptions are important for our understanding of genetics, as they enable the computation and analysis of a varied range of issues related to the genetic and developmental bases of correlated characters. The literature contains a wealth of such phenotype descriptions, usually reported a...

Bibliographic Details
Main authors: Groza, Tudor, Hunter, Jane, Zankl, Andreas
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3568099/
https://www.ncbi.nlm.nih.gov/pubmed/23409017
http://dx.doi.org/10.1371/journal.pone.0055656
_version_ 1782258767200518144
author Groza, Tudor
Hunter, Jane
Zankl, Andreas
author_facet Groza, Tudor
Hunter, Jane
Zankl, Andreas
author_sort Groza, Tudor
collection PubMed
description Phenotype descriptions are important for our understanding of genetics, as they enable the computation and analysis of a varied range of issues related to the genetic and developmental bases of correlated characters. The literature contains a wealth of such phenotype descriptions, usually reported as free-text entries, similar to typical clinical summaries. In this paper, we focus on creating and making available an annotated corpus of skeletal phenotype descriptions. In addition, we present and evaluate a hybrid Machine Learning approach for mining phenotype descriptions from free text. Our hybrid approach uses an ensemble of four classifiers and experiments with several aggregation techniques. The best scoring technique achieves an F-1 score of 71.52%, which is close to the state-of-the-art in other domains, where training data exists in abundance. Finally, we discuss the influence of the features chosen for the model on the overall performance of the method.
format Online
Article
Text
id pubmed-3568099
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3568099 2013-02-13 Mining Skeletal Phenotype Descriptions from Scientific Literature Groza, Tudor Hunter, Jane Zankl, Andreas PLoS One Research Article Phenotype descriptions are important for our understanding of genetics, as they enable the computation and analysis of a varied range of issues related to the genetic and developmental bases of correlated characters. The literature contains a wealth of such phenotype descriptions, usually reported as free-text entries, similar to typical clinical summaries. In this paper, we focus on creating and making available an annotated corpus of skeletal phenotype descriptions. In addition, we present and evaluate a hybrid Machine Learning approach for mining phenotype descriptions from free text. Our hybrid approach uses an ensemble of four classifiers and experiments with several aggregation techniques. The best scoring technique achieves an F-1 score of 71.52%, which is close to the state-of-the-art in other domains, where training data exists in abundance. Finally, we discuss the influence of the features chosen for the model on the overall performance of the method. Public Library of Science 2013-02-08 /pmc/articles/PMC3568099/ /pubmed/23409017 http://dx.doi.org/10.1371/journal.pone.0055656 Text en © 2013 Groza et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Groza, Tudor
Hunter, Jane
Zankl, Andreas
Mining Skeletal Phenotype Descriptions from Scientific Literature
title Mining Skeletal Phenotype Descriptions from Scientific Literature
title_full Mining Skeletal Phenotype Descriptions from Scientific Literature
title_fullStr Mining Skeletal Phenotype Descriptions from Scientific Literature
title_full_unstemmed Mining Skeletal Phenotype Descriptions from Scientific Literature
title_short Mining Skeletal Phenotype Descriptions from Scientific Literature
title_sort mining skeletal phenotype descriptions from scientific literature
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3568099/
https://www.ncbi.nlm.nih.gov/pubmed/23409017
http://dx.doi.org/10.1371/journal.pone.0055656
work_keys_str_mv AT grozatudor miningskeletalphenotypedescriptionsfromscientificliterature
AT hunterjane miningskeletalphenotypedescriptionsfromscientificliterature
AT zanklandreas miningskeletalphenotypedescriptionsfromscientificliterature