Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application
We describe the WhiteText project, and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomi...
Main authors: | French, Leon; Liu, Po; Marais, Olivia; Koreman, Tianna; Tseng, Lucia; Lai, Artemis; Pavlidis, Paul |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2015 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4439553/ https://www.ncbi.nlm.nih.gov/pubmed/26052282 http://dx.doi.org/10.3389/fninf.2015.00013 |
_version_ | 1782372503204659200 |
---|---|
author | French, Leon; Liu, Po; Marais, Olivia; Koreman, Tianna; Tseng, Lucia; Lai, Artemis; Pavlidis, Paul |
author_facet | French, Leon; Liu, Po; Marais, Olivia; Koreman, Tianna; Tseng, Lucia; Lai, Artemis; Pavlidis, Paul |
author_sort | French, Leon |
collection | PubMed |
description | We describe the WhiteText project, and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 67% of connectivity statements at 51% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the over 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/. |
format | Online Article Text |
id | pubmed-4439553 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-44395532015-06-05 Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application French, Leon Liu, Po Marais, Olivia Koreman, Tianna Tseng, Lucia Lai, Artemis Pavlidis, Paul Front Neuroinform Neuroscience We describe the WhiteText project, and its progress towards automatically extracting statements of neuroanatomical connectivity from text. We review progress to date on the three main steps of the project: recognition of brain region mentions, standardization of brain region mentions to neuroanatomical nomenclature, and connectivity statement extraction. We further describe a new version of our manually curated corpus that adds 2,111 connectivity statements from 1,828 additional abstracts. Cross-validation classification within the new corpus replicates results on our original corpus, recalling 67% of connectivity statements at 51% precision. The resulting merged corpus provides 5,208 connectivity statements that can be used to seed species-specific connectivity matrices and to better train automated techniques. Finally, we present a new web application that allows fast interactive browsing of the over 70,000 sentences indexed by the system, as a tool for accessing the data and assisting in further curation. Software and data are freely available at http://www.chibi.ubc.ca/WhiteText/. Frontiers Media S.A. 2015-05-21 /pmc/articles/PMC4439553/ /pubmed/26052282 http://dx.doi.org/10.3389/fninf.2015.00013 Text en Copyright © 2015 French, Liu, Marais, Koreman, Tseng, Lai and Pavlidis. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution and reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. 
No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience French, Leon Liu, Po Marais, Olivia Koreman, Tianna Tseng, Lucia Lai, Artemis Pavlidis, Paul Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application |
title | Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application |
title_full | Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application |
title_fullStr | Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application |
title_full_unstemmed | Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application |
title_short | Text mining for neuroanatomy using WhiteText with an updated corpus and a new web application |
title_sort | text mining for neuroanatomy using whitetext with an updated corpus and a new web application |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4439553/ https://www.ncbi.nlm.nih.gov/pubmed/26052282 http://dx.doi.org/10.3389/fninf.2015.00013 |