ITEXT-BIO: Intelligent Term EXTraction for BIOmedical analysis

Bibliographic Details
Main Authors: Kafando, Rodrique; Decoupes, Rémy; Valentin, Sarah; Sautot, Lucile; Teisseire, Maguelonne; Roche, Mathieu
Format: Online Article Text
Language: English
Published: Springer International Publishing 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8272612/
https://www.ncbi.nlm.nih.gov/pubmed/34276970
http://dx.doi.org/10.1007/s13755-021-00156-6
author Kafando, Rodrique
Decoupes, Rémy
Valentin, Sarah
Sautot, Lucile
Teisseire, Maguelonne
Roche, Mathieu
collection PubMed
description Here, we introduce ITEXT-BIO, an intelligent process for extracting biomedical domain terminology from textual documents and analysing it. The proposed methodology consists of two complementary approaches: free and driven term extraction. The first is based on term extraction with statistical measures, while the second applies morphosyntactic variation rules to extract term variants from the corpus. The combination of two term extraction and analysis strategies is the keystone of ITEXT-BIO. These are intra-corpus and inter-corpus strategies, which enable term extraction and analysis either from a single corpus (intra) or across several corpora (inter). We assessed the two approaches, the corpus or corpora to be analysed, and the type of statistical measures used. Our experimental findings revealed that the proposed methodology could be used: (1) to efficiently extract representative, discriminant and new terms from a given corpus or corpora, and (2) to provide quantitative and qualitative analyses of these terms with respect to the study domain.
format Online
Article
Text
id pubmed-8272612
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-8272612 2021-07-12
Health Inf Sci Syst
Research
Springer International Publishing 2021-07-10
/pmc/articles/PMC8272612/
/pubmed/34276970
http://dx.doi.org/10.1007/s13755-021-00156-6
Text en
© The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title ITEXT-BIO: Intelligent Term EXTraction for BIOmedical analysis
topic Research