Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts

Bibliographic Details
Main Authors: Cellier, Peggy, Charnois, Thierry, Plantevit, Marc, Rigotti, Christophe, Crémilleux, Bruno, Gandrillon, Olivier, Kléma, Jiří, Manguin, Jean-Luc
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4436157/
https://www.ncbi.nlm.nih.gov/pubmed/25992265
http://dx.doi.org/10.1186/s13326-015-0023-3
author Cellier, Peggy
Charnois, Thierry
Plantevit, Marc
Rigotti, Christophe
Crémilleux, Bruno
Gandrillon, Olivier
Kléma, Jiří
Manguin, Jean-Luc
collection PubMed
description BACKGROUND: Discovering gene interactions and their characterizations from biological text collections is a crucial issue in bioinformatics. Text collections are large, and it is very difficult for biologists to take full advantage of this amount of knowledge. Natural Language Processing (NLP) methods have been applied to extract background knowledge from biomedical texts. Some existing NLP approaches are based on handcrafted rules and are thus time-consuming and often devoted to a specific corpus. Machine-learning-based NLP methods give good results but generate outcomes that are not really understandable by a user. RESULTS: We take advantage of a hybridization of data mining and natural language processing to propose an original symbolic method that automatically produces patterns conveying gene interactions and their characterizations. Our method therefore detects not only gene interactions but also semantic information on the extracted interactions (e.g., modalities, biological contexts, interaction types). Only a limited resource is required: the text collection that is used as a training corpus. Our approach gives results comparable to those of state-of-the-art methods and is even better for gene interaction detection in AIMed. CONCLUSIONS: Experiments show how our approach enables the discovery of interactions and their characterizations. To the best of our knowledge, there are few methods that automatically extract the interactions together with the associated semantic information. The extracted gene interactions from PubMed are available through a simple web interface at https://bingotexte.greyc.fr/. The software is available at https://bingo2.greyc.fr/?q=node/22.
format Online
Article
Text
id pubmed-4436157
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4436157 2015-05-20 J Biomed Semantics Research Article BioMed Central 2015-05-18 /pmc/articles/PMC4436157/ /pubmed/25992265 http://dx.doi.org/10.1186/s13326-015-0023-3 Text en © Cellier et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts
topic Research Article