Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts
BACKGROUND: Discovering gene interactions and their characterizations from biological text collections is a crucial issue in bioinformatics. Text collections are large, and it is very difficult for biologists to take full advantage of this amount of knowledge. Natural Language Processing (NL...
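Main authors: | Cellier, Peggy; Charnois, Thierry; Plantevit, Marc; Rigotti, Christophe; Crémilleux, Bruno; Gandrillon, Olivier; Kléma, Jiří; Manguin, Jean-Luc |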
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4436157/ https://www.ncbi.nlm.nih.gov/pubmed/25992265 http://dx.doi.org/10.1186/s13326-015-0023-3 |
_version_ | 1782372017792614400 |
---|---|
author | Cellier, Peggy Charnois, Thierry Plantevit, Marc Rigotti, Christophe Crémilleux, Bruno Gandrillon, Olivier Kléma, Jiří Manguin, Jean-Luc |
author_facet | Cellier, Peggy Charnois, Thierry Plantevit, Marc Rigotti, Christophe Crémilleux, Bruno Gandrillon, Olivier Kléma, Jiří Manguin, Jean-Luc |
author_sort | Cellier, Peggy |
collection | PubMed |
description | BACKGROUND: Discovering gene interactions and their characterizations from biological text collections is a crucial issue in bioinformatics. Text collections are large, and it is very difficult for biologists to take full advantage of this amount of knowledge. Natural Language Processing (NLP) methods have been applied to extract background knowledge from biomedical texts. Some existing NLP approaches are based on handcrafted rules and are thus time-consuming and often tied to a specific corpus. Machine-learning-based NLP methods give good results but generate outcomes that are not readily understandable by a user. RESULTS: We take advantage of a hybridization of data mining and natural language processing to propose an original symbolic method that automatically produces patterns conveying gene interactions and their characterizations. Our method therefore detects not only gene interactions but also semantic information on the extracted interactions (e.g., modalities, biological contexts, interaction types). Only a limited resource is required: the text collection used as a training corpus. Our approach gives results comparable to those of state-of-the-art methods and is even better for gene interaction detection in AIMed. CONCLUSIONS: Experiments show how our approach enables the discovery of interactions and their characterizations. To the best of our knowledge, there are few methods that automatically extract both the interactions and the associated semantic information. The extracted gene interactions from PubMed are available through a simple web interface at https://bingotexte.greyc.fr/. The software is available at https://bingo2.greyc.fr/?q=node/22. |
format | Online Article Text |
id | pubmed-4436157 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4436157 2015-05-20 Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts Cellier, Peggy Charnois, Thierry Plantevit, Marc Rigotti, Christophe Crémilleux, Bruno Gandrillon, Olivier Kléma, Jiří Manguin, Jean-Luc J Biomed Semantics Research Article BACKGROUND: Discovering gene interactions and their characterizations from biological text collections is a crucial issue in bioinformatics. Text collections are large, and it is very difficult for biologists to take full advantage of this amount of knowledge. Natural Language Processing (NLP) methods have been applied to extract background knowledge from biomedical texts. Some existing NLP approaches are based on handcrafted rules and are thus time-consuming and often tied to a specific corpus. Machine-learning-based NLP methods give good results but generate outcomes that are not readily understandable by a user. RESULTS: We take advantage of a hybridization of data mining and natural language processing to propose an original symbolic method that automatically produces patterns conveying gene interactions and their characterizations. Our method therefore detects not only gene interactions but also semantic information on the extracted interactions (e.g., modalities, biological contexts, interaction types). Only a limited resource is required: the text collection used as a training corpus. Our approach gives results comparable to those of state-of-the-art methods and is even better for gene interaction detection in AIMed. CONCLUSIONS: Experiments show how our approach enables the discovery of interactions and their characterizations. To the best of our knowledge, there are few methods that automatically extract both the interactions and the associated semantic information. The extracted gene interactions from PubMed are available through a simple web interface at https://bingotexte.greyc.fr/. The software is available at https://bingo2.greyc.fr/?q=node/22. BioMed Central 2015-05-18 /pmc/articles/PMC4436157/ /pubmed/25992265 http://dx.doi.org/10.1186/s13326-015-0023-3 Text en © Cellier et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Cellier, Peggy Charnois, Thierry Plantevit, Marc Rigotti, Christophe Crémilleux, Bruno Gandrillon, Olivier Kléma, Jiří Manguin, Jean-Luc Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts |
title | Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts |
title_full | Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts |
title_fullStr | Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts |
title_full_unstemmed | Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts |
title_short | Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts |
title_sort | sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4436157/ https://www.ncbi.nlm.nih.gov/pubmed/25992265 http://dx.doi.org/10.1186/s13326-015-0023-3 |
work_keys_str_mv | AT cellierpeggy sequentialpatternminingfordiscoveringgeneinteractionsandtheircontextualinformationfrombiomedicaltexts AT charnoisthierry sequentialpatternminingfordiscoveringgeneinteractionsandtheircontextualinformationfrombiomedicaltexts AT plantevitmarc sequentialpatternminingfordiscoveringgeneinteractionsandtheircontextualinformationfrombiomedicaltexts AT rigottichristophe sequentialpatternminingfordiscoveringgeneinteractionsandtheircontextualinformationfrombiomedicaltexts AT cremilleuxbruno sequentialpatternminingfordiscoveringgeneinteractionsandtheircontextualinformationfrombiomedicaltexts AT gandrillonolivier sequentialpatternminingfordiscoveringgeneinteractionsandtheircontextualinformationfrombiomedicaltexts AT klemajiri sequentialpatternminingfordiscoveringgeneinteractionsandtheircontextualinformationfrombiomedicaltexts AT manguinjeanluc sequentialpatternminingfordiscoveringgeneinteractionsandtheircontextualinformationfrombiomedicaltexts |