MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites
BACKGROUND: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, motif discovery of these binding preference patterns is of central signific...
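Main authors: | Hu, Jialu; Wang, Jingru; Lin, Jianan; Liu, Tianwei; Zhong, Yuanke; Liu, Jie; Zheng, Yan; Gao, Yiqun; He, Junhao; Shang, Xuequn |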
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6509868/ https://www.ncbi.nlm.nih.gov/pubmed/31074373 http://dx.doi.org/10.1186/s12859-019-2735-3 |
_version_ | 1783417335916265472 |
---|---|
author | Hu, Jialu Wang, Jingru Lin, Jianan Liu, Tianwei Zhong, Yuanke Liu, Jie Zheng, Yan Gao, Yiqun He, Junhao Shang, Xuequn |
author_facet | Hu, Jialu Wang, Jingru Lin, Jianan Liu, Tianwei Zhong, Yuanke Liu, Jie Zheng, Yan Gao, Yiqun He, Junhao Shang, Xuequn |
author_sort | Hu, Jialu |
collection | PubMed |
description | BACKGROUND: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, motif discovery of these binding preference patterns is of central significance in understanding molecular regulation mechanisms. Many algorithms have been proposed for the identification of transcription factor binding sites. However, it remains a challenging problem. RESULTS: Here, we propose a novel motif discovery algorithm based on support vector machine (MD-SVM) to learn a discriminative model for TF binding sites. MD-SVM first obtains a position weight matrix (PWM) from a set of training datasets. It then casts the motif discovery (MD) problem in the computational framework of multiple instance learning (MIL). It was applied to several real biological datasets. Results show that our algorithm outperforms MI-SVM in terms of both accuracy and specificity. CONCLUSIONS: In this paper, we modeled the TF motif discovery problem as a MIL optimization problem. The SVM algorithm was adapted to discriminate positive and negative bags of instances. Compared with other SVM-based algorithms, MD-SVM shows superiority over its competitors in terms of ROC AUC. We hope it will benefit the research community in understanding the molecular functions of DNA functional elements and transcription factors. |
format | Online Article Text |
id | pubmed-6509868 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6509868 2019-06-05 MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites Hu, Jialu Wang, Jingru Lin, Jianan Liu, Tianwei Zhong, Yuanke Liu, Jie Zheng, Yan Gao, Yiqun He, Junhao Shang, Xuequn BMC Bioinformatics Research BACKGROUND: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, motif discovery of these binding preference patterns is of central significance in understanding molecular regulation mechanisms. Many algorithms have been proposed for the identification of transcription factor binding sites. However, it remains a challenging problem. RESULTS: Here, we propose a novel motif discovery algorithm based on support vector machine (MD-SVM) to learn a discriminative model for TF binding sites. MD-SVM first obtains a position weight matrix (PWM) from a set of training datasets. It then casts the motif discovery (MD) problem in the computational framework of multiple instance learning (MIL). It was applied to several real biological datasets. Results show that our algorithm outperforms MI-SVM in terms of both accuracy and specificity. CONCLUSIONS: In this paper, we modeled the TF motif discovery problem as a MIL optimization problem. The SVM algorithm was adapted to discriminate positive and negative bags of instances. Compared with other SVM-based algorithms, MD-SVM shows superiority over its competitors in terms of ROC AUC. We hope it will benefit the research community in understanding the molecular functions of DNA functional elements and transcription factors. 
BioMed Central 2019-05-01 /pmc/articles/PMC6509868/ /pubmed/31074373 http://dx.doi.org/10.1186/s12859-019-2735-3 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Hu, Jialu Wang, Jingru Lin, Jianan Liu, Tianwei Zhong, Yuanke Liu, Jie Zheng, Yan Gao, Yiqun He, Junhao Shang, Xuequn MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites |
title | MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites |
title_full | MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites |
title_fullStr | MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites |
title_full_unstemmed | MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites |
title_short | MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites |
title_sort | md-svm: a novel svm-based algorithm for the motif discovery of transcription factor binding sites |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6509868/ https://www.ncbi.nlm.nih.gov/pubmed/31074373 http://dx.doi.org/10.1186/s12859-019-2735-3 |
work_keys_str_mv | AT hujialu mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT wangjingru mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT linjianan mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT liutianwei mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT zhongyuanke mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT liujie mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT zhengyan mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT gaoyiqun mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT hejunhao mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites AT shangxuequn mdsvmanovelsvmbasedalgorithmforthemotifdiscoveryoftranscriptionfactorbindingsites |
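
The description above mentions two ingredients, a position weight matrix (PWM) and a multiple instance learning (MIL) view in which each sequence is a bag of candidate windows. The following is a minimal sketch of those two ideas only; it is not the authors' MD-SVM implementation and it omits the SVM training step entirely. The toy binding sites, the pseudocount, the uniform background frequency, and all function names are assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length, aligned binding sites."""
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log2 ratio of the observed frequency to the (assumed uniform) background.
        pwm.append({b: math.log2(counts[b] / total / background) for b in ALPHABET})
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores of one window (one MIL instance)."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_instance(pwm, sequence):
    """Treat a sequence as a bag of overlapping windows; return (score, offset) of the best."""
    width = len(pwm)
    scored = [
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical toy data, not taken from the paper.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TTACTCA"]
pwm = build_pwm(sites)
score, offset = best_instance(pwm, "CCGTTGACTCAGGA")
print(f"best window starts at offset {offset} with log-odds score {score:.2f}")
```

Taking the maximum over windows mirrors the usual MIL assumption that a positive bag contains at least one positive instance; a discriminative method would then train a classifier on such instances rather than rank them by PWM score alone.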