A weighted q-gram method for glycan structure classification

BACKGROUND: Glycobiology pertains to the study of carbohydrate sugar chains, or glycans, in a particular cell or organism. Many computational approaches have been proposed for analyzing these complex glycan structures, which are chains of monosaccharides. The monosaccharides are linked to one anothe...

Bibliographic Details
Main Authors: Li, Limin, Ching, Wai-Ki, Yamaguchi, Takako, Aoki-Kinoshita, Kiyoko F
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009505/
https://www.ncbi.nlm.nih.gov/pubmed/20122206
http://dx.doi.org/10.1186/1471-2105-11-S1-S33
_version_ 1782194694057361408
author Li, Limin
Ching, Wai-Ki
Yamaguchi, Takako
Aoki-Kinoshita, Kiyoko F
collection PubMed
description BACKGROUND: Glycobiology is the study of carbohydrate sugar chains, or glycans, in a particular cell or organism. Many computational approaches have been proposed for analyzing these complex glycan structures, which are chains of monosaccharides. The monosaccharides are linked to one another by glycosidic bonds, which can take on a variety of conformations, thus forming branches and resulting in complex tree structures. The q-gram method is one of the recent methods used to understand glycan function based on the classification of these tree structures. It assumes that, for a given q, different q-grams share no similarity with one another; that is, if two structures have completely different components, they are treated as completely different. From a biological standpoint, however, this is not the case. In this paper, we propose a weighted q-gram method that measures the similarity among glycans by incorporating the similarity of the geometric structures, monosaccharides and glycosidic bonds among q-grams. In contrast to the traditional q-gram method, our weighted q-gram method admits similarity among q-grams for a given q. New kernels for glycan structures were thus developed and applied in SVMs to classify glycans. RESULTS: Two glycan datasets were used to compare the weighted q-gram method with the original q-gram method. The results show that incorporating q-gram similarity improves the classification performance for all of the important glycan classes tested. CONCLUSION: The results in this paper indicate that similarity among q-grams obtained from geometric structure, monosaccharides and glycosidic linkages contributes to glycan function classification. This is a big step towards understanding glycan function based on their complex structures.
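The contrast drawn in the description, between a q-gram kernel that only counts identical q-grams and a weighted one that also credits similar q-grams, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' method: glycans are modelled as nested `(monosaccharide, children)` tuples, q-grams are restricted to downward paths of q residues, and the hypothetical `qgram_similarity` scores only matching residues by position, ignoring geometric structure and glycosidic bonds.

```python
from collections import Counter

# Illustrative assumption: a glycan is a nested tuple (monosaccharide, [children]).

def downward_qgrams(tree, q):
    """Count label paths of q nodes running from a node towards the leaves."""
    counts = Counter()

    def paths_from(node, length):
        label, children = node
        if length == 1:
            return [(label,)]
        return [(label,) + rest
                for child in children
                for rest in paths_from(child, length - 1)]

    def visit(node):
        counts.update(paths_from(node, q))
        for child in node[1]:
            visit(child)

    visit(tree)
    return counts

def qgram_similarity(g, h):
    """Hypothetical similarity: fraction of positions holding the same residue."""
    return sum(a == b for a, b in zip(g, h)) / len(g)

def plain_kernel(cx, cy):
    """Unweighted kernel: only identical q-grams contribute."""
    return sum(n * cy[g] for g, n in cx.items())

def weighted_kernel(cx, cy, sim=qgram_similarity):
    """Weighted kernel: every pair of q-grams contributes, scaled by sim."""
    return sum(n * m * sim(g, h)
               for g, n in cx.items()
               for h, m in cy.items())

# Two small made-up structures, used only to exercise the functions.
glycan_a = ("Man", [("GlcNAc", [("Gal", [])]), ("Man", [])])
glycan_b = ("Man", [("GlcNAc", [("Fuc", [])])])

counts_a = downward_qgrams(glycan_a, 2)
counts_b = downward_qgrams(glycan_b, 2)
print(plain_kernel(counts_a, counts_b), weighted_kernel(counts_a, counts_b))
```

With these toy inputs the weighted kernel is larger than the plain one, because q-grams that differ in a single residue still contribute. A Gram matrix built from such a kernel could then be passed to any SVM implementation that accepts precomputed kernels.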
format Text
id pubmed-3009505
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3009505 2010-12-23 A weighted q-gram method for glycan structure classification Li, Limin Ching, Wai-Ki Yamaguchi, Takako Aoki-Kinoshita, Kiyoko F BMC Bioinformatics Research BioMed Central 2010-01-18 /pmc/articles/PMC3009505/ /pubmed/20122206 http://dx.doi.org/10.1186/1471-2105-11-S1-S33 Text en Copyright ©2010 Li et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A weighted q-gram method for glycan structure classification
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3009505/
https://www.ncbi.nlm.nih.gov/pubmed/20122206
http://dx.doi.org/10.1186/1471-2105-11-S1-S33