Textrous!: Extracting Semantic Textual Meaning from Gene Sets

The un-biased and reproducible interpretation of high-content gene sets from large-scale genomic experiments is crucial to the understanding of biological themes, validation of experimental data, and the eventual development of plans for future experimentation. To derive biomedically-relevant inform...

Bibliographic Details
Main authors: Chen, Hongyu, Martin, Bronwen, Daimon, Caitlin M., Siddiqui, Sana, Luttrell, Louis M., Maudsley, Stuart
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639949/
https://www.ncbi.nlm.nih.gov/pubmed/23646135
http://dx.doi.org/10.1371/journal.pone.0062665
_version_ 1782476025430540288
author Chen, Hongyu
Martin, Bronwen
Daimon, Caitlin M.
Siddiqui, Sana
Luttrell, Louis M.
Maudsley, Stuart
author_facet Chen, Hongyu
Martin, Bronwen
Daimon, Caitlin M.
Siddiqui, Sana
Luttrell, Louis M.
Maudsley, Stuart
author_sort Chen, Hongyu
collection PubMed
description The un-biased and reproducible interpretation of high-content gene sets from large-scale genomic experiments is crucial to the understanding of biological themes, validation of experimental data, and the eventual development of plans for future experimentation. To derive biomedically-relevant information from simple gene lists, a mathematical association to scientific language and meaningful words or sentences is crucial. Unfortunately, existing software for deriving meaningful and easily-appreciable scientific textual ‘tokens’ from large gene sets either rely on controlled vocabularies (Medical Subject Headings, Gene Ontology, BioCarta) or employ Boolean text searching and co-occurrence models that are incapable of detecting indirect links in the literature. As an improvement to existing web-based informatic tools, we have developed Textrous!, a web-based framework for the extraction of biomedical semantic meaning from a given input gene set of arbitrary length. Textrous! employs natural language processing techniques, including latent semantic indexing (LSI), sentence splitting, word tokenization, parts-of-speech tagging, and noun-phrase chunking, to mine MEDLINE abstracts, PubMed Central articles, articles from the Online Mendelian Inheritance in Man (OMIM), and Mammalian Phenotype annotation obtained from Jackson Laboratories. Textrous! has the ability to generate meaningful output data with even very small input datasets, using two different text extraction methodologies (collective and individual) for the selecting, ranking, clustering, and visualization of English words obtained from the user data. Textrous!, therefore, is able to facilitate the output of quantitatively significant and easily appreciable semantic words and phrases linked to both individual gene and batch genomic data.
format Online
Article
Text
id pubmed-3639949
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-36399492013-05-03 Textrous!: Extracting Semantic Textual Meaning from Gene Sets Chen, Hongyu Martin, Bronwen Daimon, Caitlin M. Siddiqui, Sana Luttrell, Louis M. Maudsley, Stuart PLoS One Research Article The un-biased and reproducible interpretation of high-content gene sets from large-scale genomic experiments is crucial to the understanding of biological themes, validation of experimental data, and the eventual development of plans for future experimentation. To derive biomedically-relevant information from simple gene lists, a mathematical association to scientific language and meaningful words or sentences is crucial. Unfortunately, existing software for deriving meaningful and easily-appreciable scientific textual ‘tokens’ from large gene sets either rely on controlled vocabularies (Medical Subject Headings, Gene Ontology, BioCarta) or employ Boolean text searching and co-occurrence models that are incapable of detecting indirect links in the literature. As an improvement to existing web-based informatic tools, we have developed Textrous!, a web-based framework for the extraction of biomedical semantic meaning from a given input gene set of arbitrary length. Textrous! employs natural language processing techniques, including latent semantic indexing (LSI), sentence splitting, word tokenization, parts-of-speech tagging, and noun-phrase chunking, to mine MEDLINE abstracts, PubMed Central articles, articles from the Online Mendelian Inheritance in Man (OMIM), and Mammalian Phenotype annotation obtained from Jackson Laboratories. Textrous! has the ability to generate meaningful output data with even very small input datasets, using two different text extraction methodologies (collective and individual) for the selecting, ranking, clustering, and visualization of English words obtained from the user data. 
Textrous!, therefore, is able to facilitate the output of quantitatively significant and easily appreciable semantic words and phrases linked to both individual gene and batch genomic data. Public Library of Science 2013-04-30 /pmc/articles/PMC3639949/ /pubmed/23646135 http://dx.doi.org/10.1371/journal.pone.0062665 Text en https://creativecommons.org/publicdomain/zero/1.0/ This is an open-access article distributed under the terms of the Creative Commons Public Domain declaration, which stipulates that, once placed in the public domain, this work may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose.
spellingShingle Research Article
Chen, Hongyu
Martin, Bronwen
Daimon, Caitlin M.
Siddiqui, Sana
Luttrell, Louis M.
Maudsley, Stuart
Textrous!: Extracting Semantic Textual Meaning from Gene Sets
title Textrous!: Extracting Semantic Textual Meaning from Gene Sets
title_full Textrous!: Extracting Semantic Textual Meaning from Gene Sets
title_fullStr Textrous!: Extracting Semantic Textual Meaning from Gene Sets
title_full_unstemmed Textrous!: Extracting Semantic Textual Meaning from Gene Sets
title_short Textrous!: Extracting Semantic Textual Meaning from Gene Sets
title_sort textrous!: extracting semantic textual meaning from gene sets
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3639949/
https://www.ncbi.nlm.nih.gov/pubmed/23646135
http://dx.doi.org/10.1371/journal.pone.0062665
work_keys_str_mv AT chenhongyu textrousextractingsemantictextualmeaningfromgenesets
AT martinbronwen textrousextractingsemantictextualmeaningfromgenesets
AT daimoncaitlinm textrousextractingsemantictextualmeaningfromgenesets
AT siddiquisana textrousextractingsemantictextualmeaningfromgenesets
AT luttrelllouism textrousextractingsemantictextualmeaningfromgenesets
AT maudsleystuart textrousextractingsemantictextualmeaningfromgenesets