iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system

This article reports the use of the BioC standard format in our sentence simplification system, iSimp, and demonstrates its general utility. iSimp is designed to simplify complex sentences commonly found in the biomedical text, and has been shown to improve existing text mining applications that rel...

Bibliographic Details
Main Authors: Peng, Yifan; Tudor, Catalina O.; Torii, Manabu; Wu, Cathy H.; Vijay-Shanker, K.
Format: Online Article Text
Language: English
Published: Oxford University Press 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028706/
https://www.ncbi.nlm.nih.gov/pubmed/24850848
http://dx.doi.org/10.1093/database/bau038
_version_ 1782317095050018816
author Peng, Yifan
Tudor, Catalina O.
Torii, Manabu
Wu, Cathy H.
Vijay-Shanker, K.
author_facet Peng, Yifan
Tudor, Catalina O.
Torii, Manabu
Wu, Cathy H.
Vijay-Shanker, K.
author_sort Peng, Yifan
collection PubMed
description This article reports the use of the BioC standard format in our sentence simplification system, iSimp, and demonstrates its general utility. iSimp is designed to simplify complex sentences commonly found in the biomedical text, and has been shown to improve existing text mining applications that rely on the analysis of sentence structures. By adopting the BioC format, we aim to make iSimp readily interoperable with other applications in the biomedical domain. To examine the utility of iSimp in BioC, we implemented a rule-based relation extraction system that uses iSimp as a preprocessing module and BioC for data exchange. Evaluation on the training corpus of BioNLP-ST 2011 GENIA Event Extraction (GE) task showed that iSimp sentence simplification improved the recall by 3.2% without reducing precision. The iSimp simplification-annotated corpora, both our previously used corpus and the GE corpus in the current study, have been converted into the BioC format and made publicly available at the project’s Web site: http://research.bioinformatics.udel.edu/isimp/. Database URL:http://research.bioinformatics.udel.edu/isimp/
format Online
Article
Text
id pubmed-4028706
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-40287062014-05-22 iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system Peng, Yifan Tudor, Catalina O. Torii, Manabu Wu, Cathy H. Vijay-Shanker, K. Database (Oxford) Original Article This article reports the use of the BioC standard format in our sentence simplification system, iSimp, and demonstrates its general utility. iSimp is designed to simplify complex sentences commonly found in the biomedical text, and has been shown to improve existing text mining applications that rely on the analysis of sentence structures. By adopting the BioC format, we aim to make iSimp readily interoperable with other applications in the biomedical domain. To examine the utility of iSimp in BioC, we implemented a rule-based relation extraction system that uses iSimp as a preprocessing module and BioC for data exchange. Evaluation on the training corpus of BioNLP-ST 2011 GENIA Event Extraction (GE) task showed that iSimp sentence simplification improved the recall by 3.2% without reducing precision. The iSimp simplification-annotated corpora, both our previously used corpus and the GE corpus in the current study, have been converted into the BioC format and made publicly available at the project’s Web site: http://research.bioinformatics.udel.edu/isimp/. Database URL:http://research.bioinformatics.udel.edu/isimp/ Oxford University Press 2014-05-21 /pmc/articles/PMC4028706/ /pubmed/24850848 http://dx.doi.org/10.1093/database/bau038 Text en © The Author(s) 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Article
Peng, Yifan
Tudor, Catalina O.
Torii, Manabu
Wu, Cathy H.
Vijay-Shanker, K.
iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system
title iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system
title_full iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system
title_fullStr iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system
title_full_unstemmed iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system
title_short iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system
title_sort isimp in bioc standard format: enhancing the interoperability of a sentence simplification system
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028706/
https://www.ncbi.nlm.nih.gov/pubmed/24850848
http://dx.doi.org/10.1093/database/bau038
work_keys_str_mv AT pengyifan isimpinbiocstandardformatenhancingtheinteroperabilityofasentencesimplificationsystem
AT tudorcatalinao isimpinbiocstandardformatenhancingtheinteroperabilityofasentencesimplificationsystem
AT toriimanabu isimpinbiocstandardformatenhancingtheinteroperabilityofasentencesimplificationsystem
AT wucathyh isimpinbiocstandardformatenhancingtheinteroperabilityofasentencesimplificationsystem
AT vijayshankerk isimpinbiocstandardformatenhancingtheinteroperabilityofasentencesimplificationsystem