Natural Language Processing Methods for Enhancing Geographic Metadata for Phylogeography of Zoonotic Viruses
Zoonotic viruses represent emerging or re-emerging pathogens that pose significant public health threats throughout the world. It is therefore crucial to advance current surveillance mechanisms for these viruses through outlets such as phylogeography. Despite the abundance of zoonotic viral sequence...
Main authors: | Tahsin, Tasnia; Beard, Rachel; Rivera, Robert; Lauder, Rob; Wallstrom, Garrick; Scotch, Matthew; Gonzalez, Graciela |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Medical Informatics Association, 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4333696/ https://www.ncbi.nlm.nih.gov/pubmed/25717409 |
_version_ | 1782358085045583872 |
---|---|
author | Tahsin, Tasnia; Beard, Rachel; Rivera, Robert; Lauder, Rob; Wallstrom, Garrick; Scotch, Matthew; Gonzalez, Graciela |
author_sort | Tahsin, Tasnia |
collection | PubMed |
description | Zoonotic viruses represent emerging or re-emerging pathogens that pose significant public health threats throughout the world. It is therefore crucial to advance current surveillance mechanisms for these viruses through outlets such as phylogeography. Despite the abundance of zoonotic viral sequence data in publicly available databases such as GenBank, phylogeographic analysis of these viruses is often limited by the lack of adequate geographic metadata. However, many GenBank records include references to articles with more detailed information and automated systems may help extract this information efficiently and effectively. In this paper, we describe our efforts to determine the proportion of GenBank records with “insufficient” geographic metadata for seven well-studied viruses. We also evaluate the performance of four different Named Entity Recognition (NER) systems for automatically extracting related entities using a manually created gold-standard. |
format | Online Article Text |
id | pubmed-4333696 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | American Medical Informatics Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-43336962015-02-25 Natural Language Processing Methods for Enhancing Geographic Metadata for Phylogeography of Zoonotic Viruses Tahsin, Tasnia Beard, Rachel Rivera, Robert Lauder, Rob Wallstrom, Garrick Scotch, Matthew Gonzalez, Graciela AMIA Jt Summits Transl Sci Proc Articles Zoonotic viruses represent emerging or re-emerging pathogens that pose significant public health threats throughout the world. It is therefore crucial to advance current surveillance mechanisms for these viruses through outlets such as phylogeography. Despite the abundance of zoonotic viral sequence data in publicly available databases such as GenBank, phylogeographic analysis of these viruses is often limited by the lack of adequate geographic metadata. However, many GenBank records include references to articles with more detailed information and automated systems may help extract this information efficiently and effectively. In this paper, we describe our efforts to determine the proportion of GenBank records with “insufficient” geographic metadata for seven well-studied viruses. We also evaluate the performance of four different Named Entity Recognition (NER) systems for automatically extracting related entities using a manually created gold-standard. American Medical Informatics Association 2014-04-07 /pmc/articles/PMC4333696/ /pubmed/25717409 Text en ©2014 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose |
title | Natural Language Processing Methods for Enhancing Geographic Metadata for Phylogeography of Zoonotic Viruses |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4333696/ https://www.ncbi.nlm.nih.gov/pubmed/25717409 |