
MethylExtract: High-Quality methylation maps and SNV calling from whole genome bisulfite sequencing data

Whole genome methylation profiling at a single cytosine resolution is now feasible due to the advent of high-throughput sequencing techniques together with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first alig...


Bibliographic Details
Main authors: Barturen, Guillermo, Rueda, Antonio, Oliver, José L., Hackenberg, Michael
Format: Online Article Text
Language: English
Published: F1000Research 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3938178/
https://www.ncbi.nlm.nih.gov/pubmed/24627790
http://dx.doi.org/10.12688/f1000research.2-217.v2
_version_ 1782305574706216960
author Barturen, Guillermo
Rueda, Antonio
Oliver, José L.
Hackenberg, Michael
author_sort Barturen, Guillermo
collection PubMed
description Whole genome methylation profiling at single-cytosine resolution is now feasible due to the advent of high-throughput sequencing techniques together with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first aligned to a reference genome, and the methylation levels are then profiled from the alignments. A huge effort has been made to align the reads quickly and correctly, and many different algorithms and programs have been created to do this. The second step is just as crucial and non-trivial, yet much less attention has been paid to the final inference of the methylation states. Important error sources exist, such as sequencing errors, bisulfite failure, clonal reads, and single nucleotide variants. We developed MethylExtract, a user-friendly tool to: i) generate high-quality, whole genome methylation maps and ii) detect sequence variation within the same sample preparation. The program is implemented as a single script and takes into account all major error sources. MethylExtract detects variation (single nucleotide variants, SNVs) in a similar way to VarScan, a very sensitive method extensively used in SNV and genotype calling based on non-bisulfite-treated reads. The usefulness of MethylExtract is shown by means of extensive benchmarking based on artificial bisulfite-treated reads and a comparison to a recently published method called Bis-SNP. MethylExtract is able to detect SNVs within high-throughput sequencing experiments of bisulfite-treated DNA at the same time as it generates high-quality methylation maps. This simultaneous detection of DNA methylation and sequence variation is crucial for many downstream analyses, for example when deciphering the impact of SNVs on differential methylation. An exclusive feature of MethylExtract, in comparison with existing software, is the possibility to assess bisulfite failure statistically. The source code, tutorial and artificial bisulfite datasets are available at http://bioinfo2.ugr.es/MethylExtract/ and http://sourceforge.net/projects/methylextract/, and are also permanently accessible from 10.5281/zenodo.7144.
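The second step described in the abstract, inferring a methylation value per cytosine from aligned reads, can be illustrated with a minimal sketch. This is not MethylExtract's code or its statistical procedure, which the abstract does not specify. The function names, the 1% bisulfite failure rate and the 0.05 significance threshold are assumptions chosen for illustration, and the one-sided binomial test is a generic stand-in for the "statistical way" of assessing bisulfite failure.

```python
from math import comb


def methylation_level(c_reads: int, t_reads: int) -> float:
    """Fraction of reads showing an unconverted C at one cytosine position."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("no coverage at this position")
    return c_reads / total


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_methylated(c_reads: int, t_reads: int,
                  failure_rate: float = 0.01,  # assumed bisulfite failure rate
                  alpha: float = 0.05) -> bool:  # assumed significance level
    """True if the unconverted reads are unlikely to stem from bisulfite failure alone."""
    n = c_reads + t_reads
    return binomial_tail(c_reads, n, failure_rate) < alpha


if __name__ == "__main__":
    # Hypothetical read counts at a single position: 8 reads with C, 2 with T.
    print(methylation_level(8, 2))
    print(is_methylated(8, 2))
```

The sketch ignores strand handling, base qualities, clonal reads and SNVs, all of which the abstract names as error sources that a real tool has to address.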
format Online
Article
Text
id pubmed-3938178
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher F1000Research
record_format MEDLINE/PubMed
spelling pubmed-3938178 2014-03-12 F1000Res Method Article F1000Research 2014-02-21 /pmc/articles/PMC3938178/ /pubmed/24627790 http://dx.doi.org/10.12688/f1000research.2-217.v2 Text en Copyright: © 2014 Barturen G et al. http://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. http://creativecommons.org/publicdomain/zero/1.0/ Data associated with the article are available under the terms of the Creative Commons Zero "No rights reserved" data waiver (CC0 1.0 Public domain dedication).
title MethylExtract: High-Quality methylation maps and SNV calling from whole genome bisulfite sequencing data
topic Method Article