Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis
Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to...
Main authors: | Nunn, Adam; Otto, Christian; Stadler, Peter F; Langenberger, David |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2021 |
Subjects: | Method Review |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425420/ https://www.ncbi.nlm.nih.gov/pubmed/33624017 http://dx.doi.org/10.1093/bib/bbab021 |
_version_ | 1783749846364061696 |
---|---|
author | Nunn, Adam; Otto, Christian; Stadler, Peter F; Langenberger, David |
author_sort | Nunn, Adam |
collection | PubMed |
description | Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to a given reference, building on the knowledge acquired from model organisms such as human or Arabidopsis thaliana. As the field of epigenetics expands its purview to non-model plant species, new challenges arise that bring into question the suitability of previously established tools. Herein, nine short-read aligners are evaluated: Bismark, BS-Seeker2, BSMAP, BWA-meth, ERNE-BS5, GEM3, GSNAP, Last and segemehl. Precision-recall of simulated alignments, in comparison to real sequencing data obtained from three natural accessions, reveals that, on balance, BWA-meth and BSMAP make the best use of the data during mapping. The influence of difficult-to-map regions, characterized by deviations in sequencing depth over repeat annotations, is evaluated in terms of the mean absolute deviation of the resulting methylation calls in comparison to a realistic methylome. Downstream methylation analysis is responsive to the handling of multi-mapping reads relative to mapping quality (MAPQ), and potentially susceptible to bias arising from the increased sequence complexity of densely methylated reads. |
format | Online Article Text |
id | pubmed-8425420 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8425420 2021-09-09 Brief Bioinform Method Review Oxford University Press 2021-02-23 /pmc/articles/PMC8425420/ /pubmed/33624017 http://dx.doi.org/10.1093/bib/bbab021 Text en © The Author(s) 2021. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis |
topic | Method Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425420/ https://www.ncbi.nlm.nih.gov/pubmed/33624017 http://dx.doi.org/10.1093/bib/bbab021 |