A text mining approach to detect mentions of protein glycosylation in biomedical text

Protein glycosylation is an important post-translational event that plays a pivotal role in protein folding and protein trafficking. We describe a dictionary-based and a rule-based approach to mine ‘mentions’ of protein glycosylation in text. The dictionary-based approach relies on a set of manua...

Bibliographic Details
Main authors: Shukla, Daksha; Jayaraman, Valadi K
Format: Online Article Text
Language: English
Published: Biomedical Informatics 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3449393/
https://www.ncbi.nlm.nih.gov/pubmed/23055626
http://dx.doi.org/10.6026/97320630008758
_version_ 1782244344073289728
author Shukla, Daksha
Jayaraman, Valadi K
author_facet Shukla, Daksha
Jayaraman, Valadi K
author_sort Shukla, Daksha
collection PubMed
description Protein glycosylation is an important post-translational event that plays a pivotal role in protein folding and protein trafficking. We describe a dictionary-based and a rule-based approach to mine ‘mentions’ of protein glycosylation in text. The dictionary-based approach relies on a set of manually curated dictionaries specially constructed to address this task. Abstracts are screened for ‘mentions’ of words from these dictionaries, which are then scored and classified on the basis of a threshold. The rule-based approach also relies on the words in the dictionaries to derive the features used for classification. The performance of the system using both approaches has been evaluated on a manually curated corpus of 3133 abstracts. The evaluation suggests that the rule-based approach outperforms the dictionary-based approach.
format Online
Article
Text
id pubmed-3449393
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Biomedical Informatics
record_format MEDLINE/PubMed
spelling pubmed-34493932012-10-09 A text mining approach to detect mentions of protein glycosylation in biomedical text Shukla, Daksha Jayaraman, Valadi K Bioinformation Hypothesis Protein glycosylation is an important post-translational event that plays a pivotal role in protein folding and protein trafficking. We describe a dictionary-based and a rule-based approach to mine ‘mentions’ of protein glycosylation in text. The dictionary-based approach relies on a set of manually curated dictionaries specially constructed to address this task. Abstracts are screened for ‘mentions’ of words from these dictionaries, which are then scored and classified on the basis of a threshold. The rule-based approach also relies on the words in the dictionaries to derive the features used for classification. The performance of the system using both approaches has been evaluated on a manually curated corpus of 3133 abstracts. The evaluation suggests that the rule-based approach outperforms the dictionary-based approach. Biomedical Informatics 2012-08-24 /pmc/articles/PMC3449393/ /pubmed/23055626 http://dx.doi.org/10.6026/97320630008758 Text en © 2012 Biomedical Informatics This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
spellingShingle Hypothesis
Shukla, Daksha
Jayaraman, Valadi K
A text mining approach to detect mentions of protein glycosylation in biomedical text
title A text mining approach to detect mentions of protein glycosylation in biomedical text
title_full A text mining approach to detect mentions of protein glycosylation in biomedical text
title_fullStr A text mining approach to detect mentions of protein glycosylation in biomedical text
title_full_unstemmed A text mining approach to detect mentions of protein glycosylation in biomedical text
title_short A text mining approach to detect mentions of protein glycosylation in biomedical text
title_sort text mining approach to detect mentions of protein glycosylation in biomedical text
topic Hypothesis
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3449393/
https://www.ncbi.nlm.nih.gov/pubmed/23055626
http://dx.doi.org/10.6026/97320630008758
work_keys_str_mv AT shukladaksha atextminingapproachtodetectmentionsofproteinglycosylationinbiomedicaltext
AT jayaramanvaladik atextminingapproachtodetectmentionsofproteinglycosylationinbiomedicaltext
AT shukladaksha textminingapproachtodetectmentionsofproteinglycosylationinbiomedicaltext
AT jayaramanvaladik textminingapproachtodetectmentionsofproteinglycosylationinbiomedicaltext
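
The dictionary-based approach summarized in the description (screen an abstract for dictionary words, score the matches, classify against a threshold) might look roughly like the following minimal sketch. The term list, the weights, the threshold value, and the function names are all hypothetical placeholders chosen for illustration; the record does not reproduce the authors' curated dictionaries or their scoring scheme.

```python
import re

# Hypothetical term weights; NOT the authors' curated dictionaries.
GLYCO_TERMS = {
    "glycosylation": 2.0,
    "glycosylated": 2.0,
    "glycoprotein": 1.5,
    "n-linked": 1.0,
    "o-linked": 1.0,
    "glycan": 1.0,
}

# Assumed cut-off; the record does not state the threshold actually used.
THRESHOLD = 2.0


def score_abstract(text, terms=GLYCO_TERMS):
    """Sum the weights of all dictionary terms found in the text."""
    # Lowercase and keep hyphenated tokens such as "n-linked" intact.
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    return sum(terms.get(token, 0.0) for token in tokens)


def mentions_glycosylation(text, threshold=THRESHOLD):
    """Classify an abstract as positive if its score reaches the threshold."""
    return score_abstract(text) >= threshold


if __name__ == "__main__":
    example = "The protein is N-linked glycosylated at Asn-297."
    print(score_abstract(example), mentions_glycosylation(example))
```

A rule-based variant could reuse the same dictionary matches as features and combine them with hand-written conditions instead of a single summed score; that part is left out here because the record gives no detail on the rules.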