An Improved Systematic Approach to Predicting Transcription Factor Target Genes Using Support Vector Machine

Biological prediction of transcription factor binding sites and their corresponding transcription factor target genes (TFTGs) makes a great contribution to understanding gene regulatory networks. However, these approaches rely on laborious and time-consuming biological experiments. Numerous c...

Bibliographic Details
Main authors: Cui, Song; Youn, Eunseog; Lee, Joohyun; Maas, Stephan J.
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990533/
https://www.ncbi.nlm.nih.gov/pubmed/24743548
http://dx.doi.org/10.1371/journal.pone.0094519
_version_ 1782312296768339968
author Cui, Song
Youn, Eunseog
Lee, Joohyun
Maas, Stephan J.
collection PubMed
description Biological prediction of transcription factor binding sites and their corresponding transcription factor target genes (TFTGs) makes a great contribution to understanding gene regulatory networks. However, these approaches rely on laborious and time-consuming biological experiments. Numerous computational approaches have shown great potential to circumvent laborious biological methods. However, the majority of these algorithms provide limited performance and fail to consider the structural properties of the datasets. We proposed a refined systematic computational approach for predicting TFTGs. Building on previous work on identifying auxin response factor target genes from Arabidopsis thaliana co-expression data, we adopted a novel reverse-complementary distance-sensitive n-gram profile algorithm. This algorithm converts each upstream sub-sequence into a high-dimensional vector data point and transforms the prediction task into a classification problem solved with a support vector machine-based classifier. Our approach showed significant improvement over other computational methods, based on the area under the receiver operating characteristic curve using 10-fold cross-validation. In addition, in light of the highly skewed structure of the dataset, we also evaluated other metrics and their associated curves, such as precision-recall curves and cost curves, which provided highly satisfactory results.
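The idea of turning an upstream sub-sequence into a vector by counting n-grams, with each n-gram merged with its reverse complement, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`reverse_complement`, `canonical`, `feature_index`, `rc_ngram_profile`) and the frequency normalization are assumptions made for illustration, and the distance-sensitive component is omitted because this record does not define it.

```python
from collections import Counter
from itertools import product

# Translation table for the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(ngram: str) -> str:
    """Map an n-gram and its reverse complement to one shared key.

    Assumption: the lexicographically smaller of the two strings is
    used as the representative, so both strands count as one feature.
    """
    return min(ngram, reverse_complement(ngram))


def feature_index(n: int) -> list:
    """List every canonical n-gram of length n in a fixed, sorted order."""
    return sorted({canonical("".join(p)) for p in product("ACGT", repeat=n)})


def rc_ngram_profile(seq: str, n: int) -> list:
    """Convert a sequence into a vector of canonical n-gram frequencies.

    Windows containing characters other than A, C, G, T are skipped.
    The vector is all zeros when no valid window exists.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - n + 1):
        ngram = seq[i:i + n]
        if set(ngram) <= set("ACGT"):
            counts[canonical(ngram)] += 1
    total = sum(counts.values())
    return [counts[key] / total if total else 0.0 for key in feature_index(n)]


if __name__ == "__main__":
    # A sequence and its reverse complement yield the same profile.
    print(rc_ngram_profile("TGTCTC", 2) == rc_ngram_profile("GAGACA", 2))
```

Vectors of this kind could then be passed to any standard support vector machine classifier. Merging strands reduces the number of features (10 instead of 16 for n = 2) and makes the representation independent of which strand was read.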
format Online
Article
Text
id pubmed-3990533
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3990533 2014-04-21 An Improved Systematic Approach to Predicting Transcription Factor Target Genes Using Support Vector Machine Cui, Song Youn, Eunseog Lee, Joohyun Maas, Stephan J. PLoS One Research Article
Public Library of Science 2014-04-17 /pmc/articles/PMC3990533/ /pubmed/24743548 http://dx.doi.org/10.1371/journal.pone.0094519 Text en © 2014 Cui et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title An Improved Systematic Approach to Predicting Transcription Factor Target Genes Using Support Vector Machine
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990533/
https://www.ncbi.nlm.nih.gov/pubmed/24743548
http://dx.doi.org/10.1371/journal.pone.0094519