Systematic bias in high-throughput sequencing data and its correction by BEADS
Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant, yet unexpected enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as ch...
Main authors: | Cheung, Ming-Sin; Down, Thomas A.; Latorre, Isabel; Ahringer, Julie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2011 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3159482/ https://www.ncbi.nlm.nih.gov/pubmed/21646344 http://dx.doi.org/10.1093/nar/gkr425 |
_version_ | 1782210477680492544 |
---|---|
author | Cheung, Ming-Sin; Down, Thomas A.; Latorre, Isabel; Ahringer, Julie |
collection | PubMed |
description | Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant, yet unexpected enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features. We have focused on data obtained from Illumina’s Genome Analyser platform, where at least three factors contribute to sequence bias: GC content, mappability of sequencing reads, and regional biases that might be generated by local structure. We show that relying on input control as a normalizer is not generally appropriate due to sample to sample variation in bias. To correct sequence bias, we present BEADS (bias elimination algorithm for deep sequencing), a simple three-step normalization scheme that successfully unmasks real binding patterns in ChIP-seq data. We suggest that this procedure be done routinely prior to data interpretation and downstream analyses. |
format | Online Article Text |
id | pubmed-3159482 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3159482 2011-08-22 Nucleic Acids Res Methods Online Oxford University Press 2011-08 2011-06-06 /pmc/articles/PMC3159482/ /pubmed/21646344 http://dx.doi.org/10.1093/nar/gkr425 Text en © The Author(s) 2011. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Systematic bias in high-throughput sequencing data and its correction by BEADS |
topic | Methods Online |