iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system
This article reports the use of the BioC standard format in our sentence simplification system, iSimp, and demonstrates its general utility. iSimp is designed to simplify complex sentences commonly found in the biomedical text, and has been shown to improve existing text mining applications that rel...
Main Authors: | Peng, Yifan; Tudor, Catalina O.; Torii, Manabu; Wu, Cathy H.; Vijay-Shanker, K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028706/ https://www.ncbi.nlm.nih.gov/pubmed/24850848 http://dx.doi.org/10.1093/database/bau038 |
_version_ | 1782317095050018816 |
---|---|
author | Peng, Yifan Tudor, Catalina O. Torii, Manabu Wu, Cathy H. Vijay-Shanker, K. |
author_sort | Peng, Yifan |
collection | PubMed |
description | This article reports the use of the BioC standard format in our sentence simplification system, iSimp, and demonstrates its general utility. iSimp is designed to simplify complex sentences commonly found in the biomedical text, and has been shown to improve existing text mining applications that rely on the analysis of sentence structures. By adopting the BioC format, we aim to make iSimp readily interoperable with other applications in the biomedical domain. To examine the utility of iSimp in BioC, we implemented a rule-based relation extraction system that uses iSimp as a preprocessing module and BioC for data exchange. Evaluation on the training corpus of BioNLP-ST 2011 GENIA Event Extraction (GE) task showed that iSimp sentence simplification improved the recall by 3.2% without reducing precision. The iSimp simplification-annotated corpora, both our previously used corpus and the GE corpus in the current study, have been converted into the BioC format and made publicly available at the project’s Web site: http://research.bioinformatics.udel.edu/isimp/. Database URL: http://research.bioinformatics.udel.edu/isimp/ |
format | Online Article Text |
id | pubmed-4028706 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4028706 2014-05-22 iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system Peng, Yifan Tudor, Catalina O. Torii, Manabu Wu, Cathy H. Vijay-Shanker, K. Database (Oxford) Original Article This article reports the use of the BioC standard format in our sentence simplification system, iSimp, and demonstrates its general utility. iSimp is designed to simplify complex sentences commonly found in the biomedical text, and has been shown to improve existing text mining applications that rely on the analysis of sentence structures. By adopting the BioC format, we aim to make iSimp readily interoperable with other applications in the biomedical domain. To examine the utility of iSimp in BioC, we implemented a rule-based relation extraction system that uses iSimp as a preprocessing module and BioC for data exchange. Evaluation on the training corpus of BioNLP-ST 2011 GENIA Event Extraction (GE) task showed that iSimp sentence simplification improved the recall by 3.2% without reducing precision. The iSimp simplification-annotated corpora, both our previously used corpus and the GE corpus in the current study, have been converted into the BioC format and made publicly available at the project’s Web site: http://research.bioinformatics.udel.edu/isimp/. Database URL: http://research.bioinformatics.udel.edu/isimp/ Oxford University Press 2014-05-21 /pmc/articles/PMC4028706/ /pubmed/24850848 http://dx.doi.org/10.1093/database/bau038 Text en © The Author(s) 2014. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028706/ https://www.ncbi.nlm.nih.gov/pubmed/24850848 http://dx.doi.org/10.1093/database/bau038 |