kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets
Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-...
Main authors: | Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2013 |
Subjects: | Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692045/ https://www.ncbi.nlm.nih.gov/pubmed/23771147 http://dx.doi.org/10.1093/nar/gkt519 |
_version_ | 1782274556965158912 |
---|---|
author | Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A. |
collection | PubMed |
description | Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. |
format | Online Article Text |
id | pubmed-3692045 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3692045; 2013-06-25; Nucleic Acids Res; Articles; Oxford University Press; 2013-07; 2013-06-14; /pmc/articles/PMC3692045/; /pubmed/23771147; http://dx.doi.org/10.1093/nar/gkt519; Text; en; © The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692045/ https://www.ncbi.nlm.nih.gov/pubmed/23771147 http://dx.doi.org/10.1093/nar/gkt519 |