Filtering High-Dimensional Methylation Marks With Extremely Small Sample Size: An Application to Gastric Cancer Data

DNA methylations in critical regions are highly involved in cancer pathogenesis and drug response. However, identifying causal methylations among a large number of potentially polymorphic DNA methylation sites is challenging. This high-dimensional data brings two obstacles: first, many established st...

Bibliographic Details
Main Authors: Chen, Xin; Zhang, Qingrun; Chekouo, Thierry
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8313381/
https://www.ncbi.nlm.nih.gov/pubmed/34322159
http://dx.doi.org/10.3389/fgene.2021.705708
author Chen, Xin
Zhang, Qingrun
Chekouo, Thierry
collection PubMed
description DNA methylations in critical regions are highly involved in cancer pathogenesis and drug response. However, identifying causal methylations among a large number of potentially polymorphic DNA methylation sites is challenging. This high-dimensional data brings two obstacles: first, many established statistical models do not scale to so many features; second, multiple testing and overfitting become serious problems. A method that quickly filters candidate sites to narrow down targets for downstream analyses is therefore urgently needed. BACkPAy is a pre-screening Bayesian approach that detects biologically meaningful patterns of potential differential methylation levels with small sample sizes. BACkPAy prioritizes potentially important biomarkers using the Bayesian false discovery rate (FDR) approach. It filters out non-informative (i.e., non-differential) sites whose methylation levels are flat across experimental conditions. In this work, we applied BACkPAy to a genome-wide methylation dataset with three tissue types, each containing three gastric cancer samples. We also applied LIMMA (Linear Models for Microarray and RNA-Seq Data) and compared its results with those obtained by BACkPAy. Cox proportional hazards regression models were then used with The Cancer Genome Atlas (TCGA) data to visualize prognostically significant markers in a survival analysis. Using BACkPAy, we identified eight biologically meaningful patterns/groups of differential probes from the DNA methylation dataset. Using TCGA data, we also identified five prognostic genes (i.e., predictive of the progression of gastric cancer) that contain some differential methylation probes, whereas no significant results were identified using the Benjamini-Hochberg FDR in LIMMA. We showed the importance of using BACkPAy for the analysis of DNA methylation data with extremely small sample size in gastric cancer.
We revealed that RDH13, CLDN11, TMTC1, UCHL1, and FOXP2 can serve as predictive biomarkers for gastric cancer treatment, and that the promoter methylation level of these five genes in serum could have prognostic and diagnostic functions in gastric cancer patients.
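The abstract contrasts two ways of controlling the false discovery rate: a Bayesian FDR applied to posterior probabilities, and the Benjamini-Hochberg adjustment applied to p-values. The Python sketch below illustrates the generic textbook form of each rule only; it is not the BACkPAy or LIMMA implementation. The function names, the 0.05 threshold, and the input numbers are hypothetical and chosen purely for illustration.

```python
from typing import List


def bayesian_fdr_select(post_prob: List[float], alpha: float = 0.05) -> List[int]:
    """Indices of sites kept at an estimated Bayesian FDR <= alpha.

    post_prob[i] is an assumed posterior probability that site i is
    differentially methylated (hypothetical input, not BACkPAy output).
    """
    # Rank sites from most to least likely to be differential.
    order = sorted(range(len(post_prob)), key=lambda i: post_prob[i], reverse=True)
    selected: List[int] = []
    null_sum = 0.0  # running sum of posterior probabilities of being null
    for rank, i in enumerate(order, start=1):
        null_sum += 1.0 - post_prob[i]
        # The estimated FDR of the top-`rank` list is its mean null probability.
        # This mean never decreases down the ranking, so stop at the first excess.
        if null_sum / rank > alpha:
            break
        selected.append(i)
    return selected


def benjamini_hochberg_select(p_values: List[float], alpha: float = 0.05) -> List[int]:
    """Indices of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value falls under its threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical inputs for five sites.
posterior = [0.99, 0.40, 0.97, 0.90, 0.10]
p_vals = [0.001, 0.5, 0.012, 0.025, 0.9]
kept_bayes = bayesian_fdr_select(posterior)    # [0, 2, 3]
kept_bh = benjamini_hochberg_select(p_vals)    # [0, 2, 3]
```

With only three samples per tissue type, p-values are often too coarse to survive a Benjamini-Hochberg adjustment across a genome-wide set of probes, which is consistent with the abstract's report that LIMMA yielded no significant results.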
format Online
Article
Text
id pubmed-8313381
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8313381 2021-07-27 Filtering High-Dimensional Methylation Marks With Extremely Small Sample Size: An Application to Gastric Cancer Data Chen, Xin; Zhang, Qingrun; Chekouo, Thierry Front Genet Genetics Frontiers Media S.A. 2021-07-12 /pmc/articles/PMC8313381/ /pubmed/34322159 http://dx.doi.org/10.3389/fgene.2021.705708 Text en Copyright © 2021 Chen, Zhang and Chekouo. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Filtering High-Dimensional Methylation Marks With Extremely Small Sample Size: An Application to Gastric Cancer Data
topic Genetics