
Umap and Bismap: quantifying genome and methylome mappability

Short-read sequencing enables assessment of genetic and biochemical traits of individual genomic regions, such as the location of genetic variation, protein binding and chemical modifications. Every region in a genome assembly has a property called ‘mappability’, which measures the extent to which i...


Bibliographic Details
Main Authors: Karimzadeh, Mehran, Ernst, Carl, Kundaje, Anshul, Hoffman, Michael M
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6237805/
https://www.ncbi.nlm.nih.gov/pubmed/30169659
http://dx.doi.org/10.1093/nar/gky677
_version_ 1783371244698075136
author Karimzadeh, Mehran
Ernst, Carl
Kundaje, Anshul
Hoffman, Michael M
author_sort Karimzadeh, Mehran
collection PubMed
description Short-read sequencing enables assessment of genetic and biochemical traits of individual genomic regions, such as the location of genetic variation, protein binding and chemical modifications. Every region in a genome assembly has a property called ‘mappability’, which measures the extent to which it can be uniquely mapped by sequence reads. In regions of lower mappability, estimates of genomic and epigenomic characteristics from sequencing assays are less reliable. These regions have increased susceptibility to spurious mapping from reads from other regions of the genome with sequencing errors or unexpected genetic variation. Bisulfite sequencing approaches used to identify DNA methylation exacerbate these problems by introducing large numbers of reads that map to multiple regions. Both to correct assumptions of uniformity in downstream analysis and to identify regions where the analysis is less reliable, it is necessary to know the mappability of both ordinary and bisulfite-converted genomes. We introduce the Umap software for identifying uniquely mappable regions of any genome. Its Bismap extension identifies mappability of the bisulfite-converted genome. A Umap and Bismap track hub for human genome assemblies GRCh37/hg19 and GRCh38/hg38, and mouse assemblies GRCm37/mm9 and GRCm38/mm10 is available at https://bismap.hoffmanlab.org for use with genome browsers.
format Online
Article
Text
id pubmed-6237805
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6237805 2018-11-21 Umap and Bismap: quantifying genome and methylome mappability Karimzadeh, Mehran Ernst, Carl Kundaje, Anshul Hoffman, Michael M Nucleic Acids Res Methods Online Oxford University Press 2018-11-16 2018-08-30 /pmc/articles/PMC6237805/ /pubmed/30169659 http://dx.doi.org/10.1093/nar/gky677 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Umap and Bismap: quantifying genome and methylome mappability
topic Methods Online