Identification of SNP-containing regulatory motifs in the myelodysplastic syndromes model using SNP arrays and gene expression arrays

Bibliographic Details
Main authors: Fan, Jing; Dy, Jennifer G.; Chang, Chung-Che; Zhou, Xiaobo
Format: Online Article Text
Language: English
Published: Sun Yat-sen University Cancer Center 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3845573/
https://www.ncbi.nlm.nih.gov/pubmed/23327800
http://dx.doi.org/10.5732/cjc.012.10113
_version_ 1782293328093511680
author Fan, Jing
Dy, Jennifer G.
Chang, Chung-Che
Zhou, Xiaobo
collection PubMed
description Myelodysplastic syndromes have increased in frequency and incidence in the American population, but patient prognosis has not significantly improved over the last decade. Such improvements could be realized if biomarkers for accurate diagnosis and prognostic stratification were successfully identified. In this study, we propose a method that associates two state-of-the-art array technologies, the single nucleotide polymorphism (SNP) array and the gene expression array, with gene motifs considered transcription factor-binding sites (TFBS). We are particularly interested in SNP-containing motifs that genetic variation and mutation introduce as TFBS. The potential regulatory effect of a SNP-containing motif arises only when certain mutations occur. These motifs can be identified from a group of co-expressed genes with copy number variation. We then used a sliding window to identify motif candidates near SNPs on gene sequences. The candidates were filtered by coarse thresholding followed by fine statistical testing. Using the regression-based LARS-EN algorithm and a level-wise sequence combination procedure, we identified 28 SNP-containing motifs as candidate TFBS. We confirmed 21 of the 28 motifs with ChIP-chip fragments in the TRANSFAC database. Another six motifs were validated against TRANSFAC by searching for binding fragments on co-regulated genes. The identified motifs and the genes in which they are located can be considered potential biomarkers for myelodysplastic syndromes. Thus, our proposed method, a novel strategy for associating two data categories, can integrate information from different sources to identify reliable candidate regulatory SNP-containing motifs introduced by genetic variation and mutation.
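The sliding-window and coarse-thresholding steps named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed window width, the gene-count threshold, and the toy sequences are all assumptions made for illustration, and the statistical testing, LARS-EN regression, and level-wise combination steps are not modeled here.

```python
from collections import Counter


def windows_covering_snp(sequence, snp_pos, width):
    """Return every substring of length `width` that contains index `snp_pos` (0-based)."""
    if not 0 <= snp_pos < len(sequence):
        raise ValueError("snp_pos is outside the sequence")
    first = max(0, snp_pos - width + 1)          # leftmost window start that still covers the SNP
    last = min(snp_pos, len(sequence) - width)   # rightmost start that fits inside the sequence
    return [sequence[s:s + width] for s in range(first, last + 1)]


def coarse_filter(genes, width, min_genes):
    """Keep candidates that cover the SNP in at least `min_genes` sequences (hypothetical threshold)."""
    support = Counter()
    for sequence, snp_pos in genes:
        # A set ensures each gene contributes at most once per candidate.
        support.update(set(windows_covering_snp(sequence, snp_pos, width)))
    return {motif: n for motif, n in support.items() if n >= min_genes}


# Toy, made-up sequences paired with a SNP index (not data from the study).
toy_genes = [("ACGTGACCTA", 4), ("TTGTGACGGA", 4), ("CCCTGACAAA", 4)]
candidates = coarse_filter(toy_genes, width=4, min_genes=3)
print(candidates)
```

In this toy input only one 4-mer spans the SNP position in all three sequences, so it is the only candidate that survives the threshold; a real pipeline would pass such survivors on to the finer statistical test.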
format Online
Article
Text
id pubmed-3845573
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Sun Yat-sen University Cancer Center
record_format MEDLINE/PubMed
spelling pubmed-3845573 2013-12-11 Identification of SNP-containing regulatory motifs in the myelodysplastic syndromes model using SNP arrays and gene expression arrays Fan, Jing Dy, Jennifer G. Chang, Chung-Che Zhou, Xiaobo Chin J Cancer Original Article
Sun Yat-sen University Cancer Center 2013-04 /pmc/articles/PMC3845573/ /pubmed/23327800 http://dx.doi.org/10.5732/cjc.012.10113 Text en Chinese Journal of Cancer http://creativecommons.org/licenses/by-nc-sa/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License, which allows readers to alter, transform, or build upon the article and then distribute the resulting work under the same or similar license to this one. The work must be attributed back to the original author and commercial use is not permitted without specific permission.
title Identification of SNP-containing regulatory motifs in the myelodysplastic syndromes model using SNP arrays and gene expression arrays
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3845573/
https://www.ncbi.nlm.nih.gov/pubmed/23327800
http://dx.doi.org/10.5732/cjc.012.10113