Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach

The dramatic increase in the number of microbe descriptions in databases, reports, and papers presents a two-fold challenge for accessing the information: integration of heterogeneous data in a standard ontology-based representation and normalization of the textual descriptions by semantic analysis....

Bibliographic Details
Main authors: Dérozier, Sandra; Bossy, Robert; Deléger, Louise; Ba, Mouhamadou; Chaix, Estelle; Harlé, Olivier; Loux, Valentin; Falentin, Hélène; Nédellec, Claire
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9858090/
https://www.ncbi.nlm.nih.gov/pubmed/36662691
http://dx.doi.org/10.1371/journal.pone.0272473
_version_ 1784874012228190208
author Dérozier, Sandra
Bossy, Robert
Deléger, Louise
Ba, Mouhamadou
Chaix, Estelle
Harlé, Olivier
Loux, Valentin
Falentin, Hélène
Nédellec, Claire
author_sort Dérozier, Sandra
collection PubMed
description The dramatic increase in the number of microbe descriptions in databases, reports, and papers presents a two-fold challenge for accessing the information: integration of heterogeneous data in a standard ontology-based representation and normalization of the textual descriptions by semantic analysis. Recent text mining methods offer powerful ways to extract textual information and generate ontology-based representation. This paper describes the design of the Omnicrobe application that gathers comprehensive information on habitats, phenotypes, and usages of microbes from scientific sources of high interest to the microbiology community. The Omnicrobe database contains around 1 million descriptions of microbe properties. These descriptions are created by analyzing and combining six information sources of various kinds, i.e. biological resource catalogs, sequence databases and scientific literature. The microbe properties are indexed by the Ontobiotope ontology and their taxa are indexed by an extended version of the taxonomy maintained by the National Center for Biotechnology Information. The Omnicrobe application covers all domains of microbiology. With simple or rich ontology-based queries, it provides easy-to-use support in the resolution of scientific questions related to the habitats, phenotypes, and uses of microbes. We illustrate the potential of Omnicrobe with a use case from the food innovation domain.
format Online
Article
Text
id pubmed-9858090
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9858090 2023-01-21 Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach Dérozier, Sandra Bossy, Robert Deléger, Louise Ba, Mouhamadou Chaix, Estelle Harlé, Olivier Loux, Valentin Falentin, Hélène Nédellec, Claire PLoS One Research Article
Public Library of Science 2023-01-20 /pmc/articles/PMC9858090/ /pubmed/36662691 http://dx.doi.org/10.1371/journal.pone.0272473 Text en © 2023 Dérozier et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9858090/
https://www.ncbi.nlm.nih.gov/pubmed/36662691
http://dx.doi.org/10.1371/journal.pone.0272473