Automatically Classifying Sentences in Full-Text Biomedical Articles into Introduction, Methods, Results and Discussion

Bibliographic Details
Main authors: Agarwal, Shashank; Yu, Hong
Format: Text
Language: English
Published: American Medical Informatics Association, 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041564/
https://www.ncbi.nlm.nih.gov/pubmed/21347163
Description
Summary: Biomedical texts can be typically represented by four rhetorical categories: introduction, methods, results and discussion (IMRAD). Classifying sentences into these categories can benefit many other text-mining tasks. Although many studies have applied approaches to automatically classify sentences in MEDLINE abstracts into the IMRAD categories, few have explored the classification of sentences that appear in full-text biomedical articles. We explored different approaches to automatically classify a sentence in a full-text biomedical article into the IMRAD categories. Our best system is a support vector machine classifier that achieved 81.30% accuracy, which is significantly higher than baseline systems.