Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications

Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expa...

Bibliographic Details
Main authors: Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M.; Maudsley, Stuart
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3558626/
https://www.ncbi.nlm.nih.gov/pubmed/23386833
http://dx.doi.org/10.3389/fphys.2013.00008
_version_ 1782257456896802816
author Chen, Hongyu
Martin, Bronwen
Daimon, Caitlin M.
Maudsley, Stuart
author_facet Chen, Hongyu
Martin, Bronwen
Daimon, Caitlin M.
Maudsley, Stuart
author_sort Chen, Hongyu
collection PubMed
description Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data.
format Online
Article
Text
id pubmed-3558626
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-35586262013-02-05 Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications Chen, Hongyu Martin, Bronwen Daimon, Caitlin M. Maudsley, Stuart Front Physiol Physiology Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data. Frontiers Media S.A. 2013-01-30 /pmc/articles/PMC3558626/ /pubmed/23386833 http://dx.doi.org/10.3389/fphys.2013.00008 Text en Copyright © 2013 Chen, Martin, Daimon and Maudsley. 
http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.
spellingShingle Physiology
Chen, Hongyu
Martin, Bronwen
Daimon, Caitlin M.
Maudsley, Stuart
Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications
title Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications
title_full Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications
title_fullStr Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications
title_full_unstemmed Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications
title_short Effective use of latent semantic indexing and computational linguistics in biological and biomedical applications
title_sort effective use of latent semantic indexing and computational linguistics in biological and biomedical applications
topic Physiology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3558626/
https://www.ncbi.nlm.nih.gov/pubmed/23386833
http://dx.doi.org/10.3389/fphys.2013.00008
work_keys_str_mv AT chenhongyu effectiveuseoflatentsemanticindexingandcomputationallinguisticsinbiologicalandbiomedicalapplications
AT martinbronwen effectiveuseoflatentsemanticindexingandcomputationallinguisticsinbiologicalandbiomedicalapplications
AT daimoncaitlinm effectiveuseoflatentsemanticindexingandcomputationallinguisticsinbiologicalandbiomedicalapplications
AT maudsleystuart effectiveuseoflatentsemanticindexingandcomputationallinguisticsinbiologicalandbiomedicalapplications