
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

Bibliographic Details
Main authors: Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects: Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692045/
https://www.ncbi.nlm.nih.gov/pubmed/23771147
http://dx.doi.org/10.1093/nar/gkt519
author Fletez-Brant, Christopher
Lee, Dongwon
McCallion, Andrew S.
Beer, Michael A.
collection PubMed
description Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, chromatin immunoprecipitation followed by sequencing (ChIP-seq) assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server that allows the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic data sets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org.
format Online
Article
Text
id pubmed-3692045
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3692045 2013-06-25 Nucleic Acids Res Articles Oxford University Press 2013-07 2013-06-14 /pmc/articles/PMC3692045/ /pubmed/23771147 http://dx.doi.org/10.1093/nar/gkt519 Text en © The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692045/
https://www.ncbi.nlm.nih.gov/pubmed/23771147
http://dx.doi.org/10.1093/nar/gkt519