A structural SVM approach for reference parsing

BACKGROUND: Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references, is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extra...

Bibliographic Details
Main Authors:	Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2011
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3111593/ https://www.ncbi.nlm.nih.gov/pubmed/21658294 http://dx.doi.org/10.1186/1471-2105-12-S3-S7

author	Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R
collection	PubMed
description	BACKGROUND: Automated extraction of bibliographic data, such as article titles, author names, abstracts, and references, is essential to the affordable creation of large citation databases. References, typically appearing at the end of journal articles, can also provide valuable information for extracting other bibliographic data. Therefore, parsing individual references to extract author, title, journal, year, etc., is sometimes a necessary preprocessing step in building citation-indexing systems. The regular structure in references enables us to consider reference parsing a sequence learning problem and to study the structural Support Vector Machine (structural SVM), a newly developed structured learning algorithm, for parsing references. RESULTS: In this study, we implemented structural SVM and used two types of contextual features to compare structural SVM with conventional SVM. Both methods achieve above 98% token classification accuracy and above 95% overall chunk-level accuracy for reference parsing. We also compared SVM and structural SVM to Conditional Random Field (CRF). The experimental results show that structural SVM and CRF achieve similar accuracies at the token and chunk levels. CONCLUSIONS: When only basic observation features are used for each token, structural SVM achieves higher performance than SVM because it utilizes the contextual label features. However, when the contextual observation features from neighboring tokens are combined, SVM performance improves greatly and is close to that of structural SVM after the second-order contextual observation features are added. The comparison of these two methods with CRF using the same set of binary features shows that both structural SVM and CRF perform better than SVM, indicating their stronger sequence learning ability in reference parsing.
format	Online Article Text
id	pubmed-3111593
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3111593 2011-06-11 A structural SVM approach for reference parsing Zhang, Xiaoli; Zou, Jie; Le, Daniel X; Thoma, George R BMC Bioinformatics Research BioMed Central 2011-06-09 /pmc/articles/PMC3111593/ /pubmed/21658294 http://dx.doi.org/10.1186/1471-2105-12-S3-S7 Text en This article is in the public domain.
title	A structural SVM approach for reference parsing
topic	Research
