Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis

Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to...

Bibliographic Details
Main authors: Nunn, Adam; Otto, Christian; Stadler, Peter F; Langenberger, David
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Method Review
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425420/
https://www.ncbi.nlm.nih.gov/pubmed/33624017
http://dx.doi.org/10.1093/bib/bbab021
author Nunn, Adam
Otto, Christian
Stadler, Peter F
Langenberger, David
collection PubMed
description Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to a given reference, building on the knowledge acquired from model organisms such as human or Arabidopsis thaliana. As the field of epigenetics expands its purview to non-model plant species, new challenges arise that bring into question the suitability of previously established tools. Herein, nine short-read aligners are evaluated: Bismark, BS-Seeker2, BSMAP, BWA-meth, ERNE-BS5, GEM3, GSNAP, Last and segemehl. Precision-recall of simulated alignments, in comparison to real sequencing data obtained from three natural accessions, reveals that, on balance, BWA-meth and BSMAP are able to make the best use of the data during mapping. The influence of difficult-to-map regions, characterized by deviations in sequencing depth over repeat annotations, is evaluated in terms of the mean absolute deviation of the resulting methylation calls in comparison to a realistic methylome. Downstream methylation analysis is responsive to the handling of multi-mapping reads relative to mapping quality (MAPQ), and potentially susceptible to bias arising from the increased sequence complexity of densely methylated reads.
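The two evaluation metrics named in the description can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary-based inputs, the exact-position matching rule, and the toy values are all assumptions made for illustration, and the article may define "correct alignment" and the deviation measure differently.

```python
from statistics import mean


def precision_recall(truth, called):
    """Precision and recall of read placements against simulated origins.

    truth  : dict read_id -> (chrom, pos), the known origin of each simulated read
    called : dict read_id -> (chrom, pos), the aligner's placement; unmapped
             reads are simply absent (an assumption of this sketch)
    """
    # A read counts as correct only if it is placed exactly at its true origin.
    correct = sum(1 for rid, loc in called.items() if truth.get(rid) == loc)
    precision = correct / len(called) if called else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


def mean_absolute_deviation(expected, observed):
    """Mean of |observed - expected| methylation level over shared sites.

    Both arguments map a site (chrom, pos) to a methylation level in [0, 1].
    """
    shared = expected.keys() & observed.keys()
    if not shared:
        raise ValueError("no sites in common")
    return mean(abs(observed[site] - expected[site]) for site in shared)


# Toy data (hypothetical): r3 is misplaced and r4 is left unmapped.
truth = {"r1": ("chr1", 100), "r2": ("chr1", 250),
         "r3": ("chr2", 40), "r4": ("chr2", 900)}
called = {"r1": ("chr1", 100), "r2": ("chr1", 250), "r3": ("chr1", 40)}

expected = {("chr1", 10): 0.8, ("chr1", 20): 0.0, ("chr2", 5): 0.5}
observed = {("chr1", 10): 0.6, ("chr1", 20): 0.1, ("chr2", 5): 0.5}

precision, recall = precision_recall(truth, called)
deviation = mean_absolute_deviation(expected, observed)
print(f"precision={precision:.3f} recall={recall:.3f} deviation={deviation:.3f}")
```

In this toy case two of three reported placements are correct (precision 2/3) and two of four simulated reads are recovered (recall 1/2), which shows how an aligner that leaves difficult reads unmapped can trade recall for precision.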
format Online
Article
Text
id pubmed-8425420
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8425420 2021-09-09 Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis. Nunn, Adam; Otto, Christian; Stadler, Peter F; Langenberger, David. Brief Bioinform. Method Review. Oxford University Press 2021-02-23. /pmc/articles/PMC8425420/ /pubmed/33624017 http://dx.doi.org/10.1093/bib/bbab021 Text en © The Author(s) 2021. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis
topic Method Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425420/
https://www.ncbi.nlm.nih.gov/pubmed/33624017
http://dx.doi.org/10.1093/bib/bbab021