Machine Learning for Biomedical Literature Triage

This paper presents a machine learning system for supporting the first task of the biological literature manual curation process, called triage. We compare the performance of various classification models by experimenting with dataset sampling factors and a set of features, as well as three differe...

Bibliographic Details
Main authors: Almeida, Hayda; Meurs, Marie-Jean; Kosseim, Leila; Butler, Greg; Tsang, Adrian
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4281078/
https://www.ncbi.nlm.nih.gov/pubmed/25551575
http://dx.doi.org/10.1371/journal.pone.0115892
_version_ 1782350935147675648
author Almeida, Hayda
Meurs, Marie-Jean
Kosseim, Leila
Butler, Greg
Tsang, Adrian
author_facet Almeida, Hayda
Meurs, Marie-Jean
Kosseim, Leila
Butler, Greg
Tsang, Adrian
author_sort Almeida, Hayda
collection PubMed
description This paper presents a machine learning system for supporting the first task of the biological literature manual curation process, called triage. We compare the performance of various classification models by experimenting with dataset sampling factors and a set of features, as well as three different machine learning algorithms (Naive Bayes, Support Vector Machine and Logistic Model Trees). The results show that the most fitting model to handle the imbalanced datasets of the triage classification task is obtained by using domain-relevant features, an under-sampling technique, and the Logistic Model Trees algorithm.
format Online
Article
Text
id pubmed-4281078
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-42810782015-01-07 Machine Learning for Biomedical Literature Triage Almeida, Hayda Meurs, Marie-Jean Kosseim, Leila Butler, Greg Tsang, Adrian PLoS One Research Article This paper presents a machine learning system for supporting the first task of the biological literature manual curation process, called triage. We compare the performance of various classification models, by experimenting with dataset sampling factors and a set of features, as well as three different machine learning algorithms (Naive Bayes, Support Vector Machine and Logistic Model Trees). The results show that the most fitting model to handle the imbalanced datasets of the triage classification task is obtained by using domain relevant features, an under-sampling technique, and the Logistic Model Trees algorithm. Public Library of Science 2014-12-31 /pmc/articles/PMC4281078/ /pubmed/25551575 http://dx.doi.org/10.1371/journal.pone.0115892 Text en © 2014 Almeida et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Machine Learning for Biomedical Literature Triage
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4281078/
https://www.ncbi.nlm.nih.gov/pubmed/25551575
http://dx.doi.org/10.1371/journal.pone.0115892