MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites

BACKGROUND: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, motif discovery of these binding preference patterns is of central signific...

Bibliographic Details
Main authors: Hu, Jialu, Wang, Jingru, Lin, Jianan, Liu, Tianwei, Zhong, Yuanke, Liu, Jie, Zheng, Yan, Gao, Yiqun, He, Junhao, Shang, Xuequn
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6509868/
https://www.ncbi.nlm.nih.gov/pubmed/31074373
http://dx.doi.org/10.1186/s12859-019-2735-3
_version_ 1783417335916265472
author Hu, Jialu
Wang, Jingru
Lin, Jianan
Liu, Tianwei
Zhong, Yuanke
Liu, Jie
Zheng, Yan
Gao, Yiqun
He, Junhao
Shang, Xuequn
author_facet Hu, Jialu
Wang, Jingru
Lin, Jianan
Liu, Tianwei
Zhong, Yuanke
Liu, Jie
Zheng, Yan
Gao, Yiqun
He, Junhao
Shang, Xuequn
author_sort Hu, Jialu
collection PubMed
description BACKGROUND: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, discovering the motifs that capture these binding preferences is of central significance for understanding molecular regulation mechanisms. Many algorithms have been proposed for the identification of transcription factor binding sites. However, it remains a challenging problem. RESULTS: Here, we propose a novel motif discovery algorithm based on the support vector machine (MD-SVM) to learn a discriminative model for TF binding sites. MD-SVM first obtains a position weight matrix (PWM) from a set of training datasets. It then translates the motif discovery (MD) problem into the computational framework of multiple instance learning (MIL). It was applied to several real biological datasets. Results show that our algorithm outperforms MI-SVM in terms of both accuracy and specificity. CONCLUSIONS: In this paper, we modeled the TF motif discovery problem as a MIL optimization problem. The SVM algorithm was adapted to discriminate positive and negative bags of instances. Compared to other SVM-based algorithms, MD-SVM shows superiority over its competitors in terms of ROC AUC. Hopefully, it will benefit the research community in understanding the molecular functions of DNA functional elements and transcription factors.
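The abstract names two ingredients, a position weight matrix and a multiple-instance view in which each sequence is a bag of candidate windows. The sketch below illustrates only those two ideas in plain Python; it is not the authors' MD-SVM implementation. The function names, the pseudocount, the uniform background of 0.25, and the toy sequences are all assumptions made for illustration, and the SVM training step is omitted entirely.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites (hypothetical helper)."""
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2(counts[base] / total / background) for base in ALPHABET}
        )
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores of one window (one MIL instance)."""
    return sum(column[base] for column, base in zip(pwm, window))


def bag_instances(sequence, width):
    """Treat a sequence as a bag: every window of the given width is an instance."""
    return [sequence[i:i + width] for i in range(len(sequence) - width + 1)]


def best_instance(pwm, sequence):
    """Return the highest-scoring window in the bag."""
    return max(bag_instances(sequence, len(pwm)), key=lambda w: score_window(pwm, w))


# Toy data, invented for this example only.
toy_sites = ["TATA", "TATA", "TATA"]
toy_pwm = build_pwm(toy_sites)
print(best_instance(toy_pwm, "GGGTATAGG"))
```

Picking the single best-scoring window as the representative of a bag mirrors how MI-SVM-style methods select one instance per positive bag; how MD-SVM itself combines the PWM with the SVM is not specified in this record.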
format Online
Article
Text
id pubmed-6509868
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6509868 2019-06-05 MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites Hu, Jialu Wang, Jingru Lin, Jianan Liu, Tianwei Zhong, Yuanke Liu, Jie Zheng, Yan Gao, Yiqun He, Junhao Shang, Xuequn BMC Bioinformatics Research BACKGROUND: Transcription factors (TFs) play important roles in the regulation of gene expression. They can activate or block transcription of downstream genes by binding to specific genomic sequences. Therefore, discovering the motifs that capture these binding preferences is of central significance for understanding molecular regulation mechanisms. Many algorithms have been proposed for the identification of transcription factor binding sites. However, it remains a challenging problem. RESULTS: Here, we propose a novel motif discovery algorithm based on the support vector machine (MD-SVM) to learn a discriminative model for TF binding sites. MD-SVM first obtains a position weight matrix (PWM) from a set of training datasets. It then translates the motif discovery (MD) problem into the computational framework of multiple instance learning (MIL). It was applied to several real biological datasets. Results show that our algorithm outperforms MI-SVM in terms of both accuracy and specificity. CONCLUSIONS: In this paper, we modeled the TF motif discovery problem as a MIL optimization problem. The SVM algorithm was adapted to discriminate positive and negative bags of instances. Compared to other SVM-based algorithms, MD-SVM shows superiority over its competitors in terms of ROC AUC. Hopefully, it will benefit the research community in understanding the molecular functions of DNA functional elements and transcription factors.
BioMed Central 2019-05-01 /pmc/articles/PMC6509868/ /pubmed/31074373 http://dx.doi.org/10.1186/s12859-019-2735-3 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Hu, Jialu
Wang, Jingru
Lin, Jianan
Liu, Tianwei
Zhong, Yuanke
Liu, Jie
Zheng, Yan
Gao, Yiqun
He, Junhao
Shang, Xuequn
MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites
title MD-SVM: a novel SVM-based algorithm for the motif discovery of transcription factor binding sites
title_sort md-svm: a novel svm-based algorithm for the motif discovery of transcription factor binding sites
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6509868/
https://www.ncbi.nlm.nih.gov/pubmed/31074373
http://dx.doi.org/10.1186/s12859-019-2735-3