LOcating Non-Unique matched Tags (LONUT) to Improve the Detection of the Enriched Regions for ChIP-seq Data

One big limitation of computational tools for analyzing ChIP-seq data is that most of them ignore non-unique tags (NUTs) that match the human genome even though NUTs comprise up to 60% of all raw tags in ChIP-seq data. Effectively utilizing these NUTs would increase the sequencing depth and allow a...

Bibliographic Details
Main Authors: Wang, Rui; Hsu, Hang-Kai; Blattler, Adam; Wang, Yisong; Lan, Xun; Wang, Yao; Hsu, Pei-Yin; Leu, Yu-Wei; Huang, Tim H.-M.; Farnham, Peggy J.; Jin, Victor X.
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692479/
https://www.ncbi.nlm.nih.gov/pubmed/23825685
http://dx.doi.org/10.1371/journal.pone.0067788
_version_ 1782274622493818880
author Wang, Rui
Hsu, Hang-Kai
Blattler, Adam
Wang, Yisong
Lan, Xun
Wang, Yao
Hsu, Pei-Yin
Leu, Yu-Wei
Huang, Tim H.-M.
Farnham, Peggy J.
Jin, Victor X.
author_sort Wang, Rui
collection PubMed
description One big limitation of computational tools for analyzing ChIP-seq data is that most of them ignore non-unique tags (NUTs) that match the human genome even though NUTs comprise up to 60% of all raw tags in ChIP-seq data. Effectively utilizing these NUTs would increase the sequencing depth and allow a more accurate detection of enriched binding sites, which in turn could lead to more precise and significant biological interpretations. In this study, we have developed a computational tool, LOcating Non-Unique matched Tags (LONUT), to improve the detection of enriched regions from ChIP-seq data. Our LONUT algorithm applies a linear and polynomial regression model to establish an empirical score (ES) formula by considering two influential factors, the distance of NUTs to peaks identified using uniquely matched tags (UMTs) and the enrichment score for those peaks resulting in each NUT being assigned to a unique location on the reference genome. The newly located tags from the set of NUTs are combined with the original UMTs to produce a final set of combined matched tags (CMTs). LONUT was tested on many different datasets representing three different characteristics of biological data types. The detected sites were validated using de novo motif discovery and ChIP-PCR. We demonstrate the specificity and accuracy of LONUT and show that our program not only improves the detection of binding sites for ChIP-seq, but also identifies additional binding sites.
format Online
Article
Text
id pubmed-3692479
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3692479 2013-07-02 PLoS One Research Article
Public Library of Science 2013-06-25 /pmc/articles/PMC3692479/ /pubmed/23825685 http://dx.doi.org/10.1371/journal.pone.0067788 Text en © 2013 Wang et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title LOcating Non-Unique matched Tags (LONUT) to Improve the Detection of the Enriched Regions for ChIP-seq Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692479/
https://www.ncbi.nlm.nih.gov/pubmed/23825685
http://dx.doi.org/10.1371/journal.pone.0067788
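
The description field above says that each NUT is assigned to a single genomic location using an empirical score (ES) that depends on two factors: the distance from the NUT to a peak called from UMTs, and the enrichment score of that peak. The following is a minimal sketch of that assignment step only. The record does not give the ES formula, which the authors fit by linear and polynomial regression, so `empirical_score` here is a hypothetical stand-in (a simple product of closeness and enrichment). The `Candidate` class, the `max_distance` cutoff, and all field names are likewise assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One possible genomic location for a non-unique tag (hypothetical type)."""
    chrom: str
    pos: int
    peak_distance: int      # bp from this location to the nearest UMT peak
    peak_enrichment: float  # enrichment score of that peak


def empirical_score(c: Candidate, max_distance: int = 1000) -> float:
    """Hypothetical stand-in for the ES formula, NOT the published regression fit.

    Closer and more strongly enriched peaks score higher; a candidate farther
    than max_distance (an assumed cutoff) from any peak scores zero.
    """
    if c.peak_distance > max_distance:
        return 0.0
    closeness = 1.0 - c.peak_distance / max_distance
    return closeness * c.peak_enrichment


def assign_nut(candidates: Sequence[Candidate]) -> Optional[Candidate]:
    """Return the single best-scoring location, or None if no candidate scores above zero."""
    best = max(candidates, key=empirical_score, default=None)
    if best is None or empirical_score(best) <= 0.0:
        return None
    return best
```

Under these assumptions, a NUT whose candidate locations all lie far from any UMT peak stays unassigned, while the rest contribute one tag each to the combined set (CMTs) alongside the original UMTs.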