
Risk Assessment of Gastric Cancer Caused by Helicobacter pylori Using CagA Sequence Markers

BACKGROUND: As a marker of Helicobacter pylori, Cytotoxin-associated gene A (cagA) has been revealed to be the major virulence factor causing gastroduodenal diseases. However, the molecular mechanisms that underlie the development of different gastroduodenal diseases caused by cagA-positive H. pylor...


Bibliographic Details
Main authors: Zhang, Chao, Xu, Shunfu, Xu, Dong
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3352932/
https://www.ncbi.nlm.nih.gov/pubmed/22615823
http://dx.doi.org/10.1371/journal.pone.0036844
_version_ 1782232982622306304
author Zhang, Chao
Xu, Shunfu
Xu, Dong
author_facet Zhang, Chao
Xu, Shunfu
Xu, Dong
author_sort Zhang, Chao
collection PubMed
description BACKGROUND: Cytotoxin-associated gene A (cagA), a marker of Helicobacter pylori, has been identified as the major virulence factor causing gastroduodenal diseases. However, the molecular mechanisms that underlie the development of different gastroduodenal diseases caused by cagA-positive H. pylori infection remain unknown. Current studies are limited to evaluating the correlation between diseases and the number of Glu-Pro-Ile-Tyr-Ala (EPIYA) motifs in the CagA strain. To further understand the relationship between the CagA sequence and its virulence in gastric cancer, we proposed a systematic entropy-based approach to identify cancer-related residues in the intervening regions of CagA and employed a supervised machine learning method to classify cancer and non-cancer cases. METHODOLOGY: An entropy-based calculation was used to detect key residues of CagA intervening sequences as gastric cancer biomarkers. For each residue, both combinatorial entropy and background entropy were calculated, and the entropy difference was used as the criterion for feature residue selection. The feature values were then fed into Support Vector Machines (SVM) with the Radial Basis Function (RBF) kernel, and two parameters were tuned by grid search to obtain the optimal F value. Two other popular sequence classification methods, BLAST and HMMER, were also applied to the same data for comparison. CONCLUSION: Our method achieved 76% and 71% classification accuracy for the Western and East Asian subtypes, respectively, significantly better than BLAST and HMMER. This research indicates that small variations of amino acids at those important residues might lead to the variation in virulence among CagA strains that results in different gastroduodenal diseases. This study provides not only a useful tool to predict the correlation between a novel CagA strain and diseases, but also a general new framework for detecting biological sequence biomarkers in population studies.
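The entropy-difference criterion described in the abstract can be illustrated with a minimal sketch. The record does not give the paper's formulas for combinatorial or background entropy, so this example substitutes plain Shannon entropy: each aligned column is scored by the pooled (background) entropy minus the size-weighted entropy within the cancer and non-cancer groups. The function names, the toy sequences, and the `top_k` parameter are all hypothetical and are not taken from the article.

```python
from collections import Counter
from math import log2


def shannon_entropy(residues):
    """Shannon entropy (bits) of the residue distribution in one alignment column."""
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def entropy_difference(cancer_seqs, control_seqs):
    """Score each column as pooled entropy minus size-weighted within-group entropy.

    Assumption: Shannon entropy stands in for the paper's combinatorial and
    background entropy, whose exact definitions are not in this record.
    """
    n_cancer, n_control = len(cancer_seqs), len(control_seqs)
    total = n_cancer + n_control
    scores = []
    for col in range(len(cancer_seqs[0])):
        cancer_col = [s[col] for s in cancer_seqs]
        control_col = [s[col] for s in control_seqs]
        background = shannon_entropy(cancer_col + control_col)
        within = (n_cancer * shannon_entropy(cancer_col)
                  + n_control * shannon_entropy(control_col)) / total
        scores.append(background - within)
    return scores


def select_features(scores, top_k):
    """Return the indices of the top_k highest-scoring columns, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]


# Toy, made-up aligned fragments of equal length (hypothetical data).
cancer = ["AKD", "AKD", "AKE"]
control = ["ARD", "ARD", "ARE"]
scores = entropy_difference(cancer, control)
selected = select_features(scores, top_k=1)
```

In this toy input only the middle column separates the two groups, so it receives the highest score. A column that varies equally in both groups scores near zero, which is the behavior a group-discriminating criterion should have. Residues at the selected columns would then be encoded as feature values for a classifier such as the RBF-kernel SVM the abstract mentions.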
format Online
Article
Text
id pubmed-3352932
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3352932 2012-05-21 Risk Assessment of Gastric Cancer Caused by Helicobacter pylori Using CagA Sequence Markers Zhang, Chao Xu, Shunfu Xu, Dong PLoS One Research Article Public Library of Science 2012-05-15 /pmc/articles/PMC3352932/ /pubmed/22615823 http://dx.doi.org/10.1371/journal.pone.0036844 Text en Zhang et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Zhang, Chao
Xu, Shunfu
Xu, Dong
Risk Assessment of Gastric Cancer Caused by Helicobacter pylori Using CagA Sequence Markers
title Risk Assessment of Gastric Cancer Caused by Helicobacter pylori Using CagA Sequence Markers
title_full Risk Assessment of Gastric Cancer Caused by Helicobacter pylori Using CagA Sequence Markers
title_fullStr Risk Assessment of Gastric Cancer Caused by Helicobacter pylori Using CagA Sequence Markers
title_full_unstemmed Risk Assessment of Gastric Cancer Caused by Helicobacter pylori Using CagA Sequence Markers
title_short Risk Assessment of Gastric Cancer Caused by Helicobacter pylori Using CagA Sequence Markers
title_sort risk assessment of gastric cancer caused by helicobacter pylori using caga sequence markers
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3352932/
https://www.ncbi.nlm.nih.gov/pubmed/22615823
http://dx.doi.org/10.1371/journal.pone.0036844
work_keys_str_mv AT zhangchao riskassessmentofgastriccancercausedbyhelicobacterpyloriusingcagasequencemarkers
AT xushunfu riskassessmentofgastriccancercausedbyhelicobacterpyloriusingcagasequencemarkers
AT xudong riskassessmentofgastriccancercausedbyhelicobacterpyloriusingcagasequencemarkers