
Text Mining the History of Medicine

Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining...


Bibliographic Details
Main authors:	Thompson, Paul, Batista-Navarro, Riza Theresa, Kontonatsios, Georgios, Carter, Jacob, Toon, Elizabeth, McNaught, John, Timmermann, Carsten, Worboys, Michael, Ananiadou, Sophia
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2016
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4703377/ https://www.ncbi.nlm.nih.gov/pubmed/26734936 http://dx.doi.org/10.1371/journal.pone.0144717

_version_	1782408734283137024
author	Thompson, Paul Batista-Navarro, Riza Theresa Kontonatsios, Georgios Carter, Jacob Toon, Elizabeth McNaught, John Timmermann, Carsten Worboys, Michael Ananiadou, Sophia
author_facet	Thompson, Paul Batista-Navarro, Riza Theresa Kontonatsios, Georgios Carter, Jacob Toon, Elizabeth McNaught, John Timmermann, Carsten Worboys, Michael Ananiadou, Sophia
author_sort	Thompson, Paul
collection	PubMed
description	Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, owing to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid-19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform.
format	Online Article Text
id	pubmed-4703377
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4703377 2016-01-15 Text Mining the History of Medicine Thompson, Paul Batista-Navarro, Riza Theresa Kontonatsios, Georgios Carter, Jacob Toon, Elizabeth McNaught, John Timmermann, Carsten Worboys, Michael Ananiadou, Sophia PLoS One Research Article Public Library of Science 2016-01-06 /pmc/articles/PMC4703377/ /pubmed/26734936 http://dx.doi.org/10.1371/journal.pone.0144717 Text en © 2016 Thompson et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
spellingShingle	Research Article Thompson, Paul Batista-Navarro, Riza Theresa Kontonatsios, Georgios Carter, Jacob Toon, Elizabeth McNaught, John Timmermann, Carsten Worboys, Michael Ananiadou, Sophia Text Mining the History of Medicine
title	Text Mining the History of Medicine
title_full	Text Mining the History of Medicine
title_fullStr	Text Mining the History of Medicine
title_full_unstemmed	Text Mining the History of Medicine
title_short	Text Mining the History of Medicine
title_sort	text mining the history of medicine
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4703377/ https://www.ncbi.nlm.nih.gov/pubmed/26734936 http://dx.doi.org/10.1371/journal.pone.0144717
work_keys_str_mv	AT thompsonpaul textminingthehistoryofmedicine AT batistanavarrorizatheresa textminingthehistoryofmedicine AT kontonatsiosgeorgios textminingthehistoryofmedicine AT carterjacob textminingthehistoryofmedicine AT toonelizabeth textminingthehistoryofmedicine AT mcnaughtjohn textminingthehistoryofmedicine AT timmermanncarsten textminingthehistoryofmedicine AT worboysmichael textminingthehistoryofmedicine AT ananiadousophia textminingthehistoryofmedicine

