An Improved Systematic Approach to Predicting Transcription Factor Target Genes Using Support Vector Machine
Biological prediction of transcription factor binding sites and their corresponding transcription factor target genes (TFTGs) makes great contribution to understanding the gene regulatory networks. However, these approaches are based on laborious and time-consuming biological experiments. Numerous c...
Main Authors: | Cui, Song; Youn, Eunseog; Lee, Joohyun; Maas, Stephan J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2014 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990533/ https://www.ncbi.nlm.nih.gov/pubmed/24743548 http://dx.doi.org/10.1371/journal.pone.0094519 |
_version_ | 1782312296768339968 |
---|---|
author | Cui, Song; Youn, Eunseog; Lee, Joohyun; Maas, Stephan J. |
author_facet | Cui, Song; Youn, Eunseog; Lee, Joohyun; Maas, Stephan J. |
author_sort | Cui, Song |
collection | PubMed |
description | Biological prediction of transcription factor binding sites and their corresponding transcription factor target genes (TFTGs) makes great contribution to understanding the gene regulatory networks. However, these approaches are based on laborious and time-consuming biological experiments. Numerous computational approaches have shown great potential to circumvent laborious biological methods. However, the majority of these algorithms provide limited performances and fail to consider the structural property of the datasets. We proposed a refined systematic computational approach for predicting TFTGs. Based on previous work done on identifying auxin response factor target genes from Arabidopsis thaliana co-expression data, we adopted a novel reverse-complementary distance-sensitive n-gram profile algorithm. This algorithm converts each upstream sub-sequence into a high-dimensional vector data point and transforms the prediction task into a classification problem using support vector machine-based classifier. Our approach showed significant improvement compared to other computational methods based on the area under curve value of the receiver operating characteristic curve using 10-fold cross validation. In addition, in the light of the highly skewed structure of the dataset, we also evaluated other metrics and their associated curves, such as precision-recall curves and cost curves, which provided highly satisfactory results. |
format | Online Article Text |
id | pubmed-3990533 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-39905332014-04-21 An Improved Systematic Approach to Predicting Transcription Factor Target Genes Using Support Vector Machine Cui, Song Youn, Eunseog Lee, Joohyun Maas, Stephan J. PLoS One Research Article Biological prediction of transcription factor binding sites and their corresponding transcription factor target genes (TFTGs) makes great contribution to understanding the gene regulatory networks. However, these approaches are based on laborious and time-consuming biological experiments. Numerous computational approaches have shown great potential to circumvent laborious biological methods. However, the majority of these algorithms provide limited performances and fail to consider the structural property of the datasets. We proposed a refined systematic computational approach for predicting TFTGs. Based on previous work done on identifying auxin response factor target genes from Arabidopsis thaliana co-expression data, we adopted a novel reverse-complementary distance-sensitive n-gram profile algorithm. This algorithm converts each upstream sub-sequence into a high-dimensional vector data point and transforms the prediction task into a classification problem using support vector machine-based classifier. Our approach showed significant improvement compared to other computational methods based on the area under curve value of the receiver operating characteristic curve using 10-fold cross validation. In addition, in the light of the highly skewed structure of the dataset, we also evaluated other metrics and their associated curves, such as precision-recall curves and cost curves, which provided highly satisfactory results. 
Public Library of Science 2014-04-17 /pmc/articles/PMC3990533/ /pubmed/24743548 http://dx.doi.org/10.1371/journal.pone.0094519 Text en © 2014 Cui et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Cui, Song Youn, Eunseog Lee, Joohyun Maas, Stephan J. An Improved Systematic Approach to Predicting Transcription Factor Target Genes Using Support Vector Machine |
title | An Improved Systematic Approach to Predicting Transcription Factor Target Genes Using Support Vector Machine |
title_sort | improved systematic approach to predicting transcription factor target genes using support vector machine |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3990533/ https://www.ncbi.nlm.nih.gov/pubmed/24743548 http://dx.doi.org/10.1371/journal.pone.0094519 |
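
The description above mentions a reverse-complementary n-gram profile that turns each upstream sub-sequence into a vector. The record does not give the algorithm's details, so the following is only a minimal sketch of the reverse-complement part, under stated assumptions: each n-gram is merged with its reverse complement into a single feature, and the distance-sensitive component and the support vector machine step are omitted because the abstract does not specify them. The names `reverse_complement`, `canonical`, and `rc_ngram_profile` are hypothetical and do not come from the article.

```python
from collections import Counter
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(ngram: str) -> str:
    """Pick one representative for an n-gram and its reverse complement.

    Assumption: the lexicographically smaller of the two strings is used,
    so both strands map to the same feature.
    """
    return min(ngram, reverse_complement(ngram))


def rc_ngram_profile(seq: str, n: int = 3) -> dict:
    """Count canonical n-grams in seq and return a fixed-key profile.

    Every canonical n-gram over A, C, G, T gets a key (zero if absent),
    so profiles from different sequences share the same dimensions.
    Windows containing other symbols (for example N) are skipped.
    """
    seq = seq.upper()
    keys = sorted({canonical("".join(p)) for p in product("ACGT", repeat=n)})
    counts = Counter()
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if set(gram) <= set("ACGT"):
            counts[canonical(gram)] += 1
    return {key: counts[key] for key in keys}


if __name__ == "__main__":
    profile = rc_ngram_profile("ACGT", n=2)
    # Show only the n-grams that actually occur in the toy sequence.
    print({k: v for k, v in profile.items() if v})
```

Because an n-gram and its reverse complement share one key, a sequence and its reverse complement produce the same profile in this sketch. The values of such a profile, taken in key order, could serve as one input vector for a classifier; how the article weights positions or distances is not covered here.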