
A Method for Identification of the Methylation Level of CpG Islands From NGS Data

In the course of sample preparation for Next Generation Sequencing (NGS), DNA is fragmented by various methods. Fragmentation shows a persistent bias with regard to the cleavage rates of various dinucleotides. With the exception of CpG dinucleotides the previously described biases were consistent wi...


Bibliographic Details
Main authors: Uroshlev, Leonid A., Abdullaev, Eldar T., Umarova, Iren R., Il’icheva, Irina A., Panchenko, Larisa A., Polozov, Robert V., Kondrashov, Fyodor A., Nechipurenko, Yury D., Grokhovsky, Sergei L.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7248081/
https://www.ncbi.nlm.nih.gov/pubmed/32451390
http://dx.doi.org/10.1038/s41598-020-65406-1
_version_ 1783538291698565120
author Uroshlev, Leonid A.
Abdullaev, Eldar T.
Umarova, Iren R.
Il’icheva, Irina A.
Panchenko, Larisa A.
Polozov, Robert V.
Kondrashov, Fyodor A.
Nechipurenko, Yury D.
Grokhovsky, Sergei L.
author_facet Uroshlev, Leonid A.
Abdullaev, Eldar T.
Umarova, Iren R.
Il’icheva, Irina A.
Panchenko, Larisa A.
Polozov, Robert V.
Kondrashov, Fyodor A.
Nechipurenko, Yury D.
Grokhovsky, Sergei L.
author_sort Uroshlev, Leonid A.
collection PubMed
description In the course of sample preparation for Next Generation Sequencing (NGS), DNA is fragmented by various methods. Fragmentation shows a persistent bias with regard to the cleavage rates of various dinucleotides. With the exception of CpG dinucleotides the previously described biases were consistent with results of the DNA cleavage in solution. Here we computed cleavage rates of all dinucleotides including the methylated CpG and unmethylated CpG dinucleotides using data of the Whole Genome Sequencing datasets of the 1000 Genomes project. We found that the cleavage rate of CpG is significantly higher for the methylated CpG dinucleotides. Using this information, we developed a classifier for distinguishing cancer and healthy tissues based on their CpG islands statuses of the fragmentation. A simple Support Vector Machine classifier based on this algorithm shows an accuracy of 84%. The proposed method allows the detection of epigenetic markers purely based on mechanochemical DNA fragmentation, which can be detected by a simple analysis of the NGS sequencing data.
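The description above refers to per-dinucleotide cleavage rates. As a rough illustration only, and not the authors' pipeline, the idea can be sketched by counting which dinucleotide spans each fragment break and normalising by how often that dinucleotide occurs in the reference. The function name `dinucleotide_cleavage_rates`, the toy sequence, and the break positions are all hypothetical. Real data would require read alignment and separate handling of methylated and unmethylated CpG, which this sketch omits.

```python
from collections import Counter


def dinucleotide_cleavage_rates(reference, break_positions):
    """Return observed breaks per occurrence for each dinucleotide.

    reference: DNA sequence as a string (toy input, assumed here).
    break_positions: indices i, each meaning a break between
        reference[i - 1] and reference[i]; repeats are allowed
        (several fragments may end at the same site).
    """
    reference = reference.upper()
    # How often each dinucleotide occurs in the reference.
    totals = Counter(reference[i - 1:i + 1] for i in range(1, len(reference)))
    # How often a break falls inside each dinucleotide.
    cuts = Counter(
        reference[i - 1:i + 1]
        for i in break_positions
        if 0 < i < len(reference)
    )
    # Breaks per site; dinucleotides never cut get a rate of 0.0.
    return {dn: cuts[dn] / n for dn, n in totals.items()}


# Toy usage: "ACGTCGA" contains CG twice; three breaks fall inside CG.
rates = dinucleotide_cleavage_rates("ACGTCGA", [2, 2, 5, 3])
```

In this toy case `rates["CG"]` is 1.5 (three breaks over two CG sites), while dinucleotides with no breaks, such as AC, come out as 0.0. Rates computed this way per CpG island could then serve as input features for a classifier such as the Support Vector Machine mentioned in the description.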
format Online
Article
Text
id pubmed-7248081
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-72480812020-06-04 A Method for Identification of the Methylation Level of CpG Islands From NGS Data Uroshlev, Leonid A. Abdullaev, Eldar T. Umarova, Iren R. Il’icheva, Irina A. Panchenko, Larisa A. Polozov, Robert V. Kondrashov, Fyodor A. Nechipurenko, Yury D. Grokhovsky, Sergei L. Sci Rep Article In the course of sample preparation for Next Generation Sequencing (NGS), DNA is fragmented by various methods. Fragmentation shows a persistent bias with regard to the cleavage rates of various dinucleotides. With the exception of CpG dinucleotides the previously described biases were consistent with results of the DNA cleavage in solution. Here we computed cleavage rates of all dinucleotides including the methylated CpG and unmethylated CpG dinucleotides using data of the Whole Genome Sequencing datasets of the 1000 Genomes project. We found that the cleavage rate of CpG is significantly higher for the methylated CpG dinucleotides. Using this information, we developed a classifier for distinguishing cancer and healthy tissues based on their CpG islands statuses of the fragmentation. A simple Support Vector Machine classifier based on this algorithm shows an accuracy of 84%. The proposed method allows the detection of epigenetic markers purely based on mechanochemical DNA fragmentation, which can be detected by a simple analysis of the NGS sequencing data. Nature Publishing Group UK 2020-05-25 /pmc/articles/PMC7248081/ /pubmed/32451390 http://dx.doi.org/10.1038/s41598-020-65406-1 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Uroshlev, Leonid A.
Abdullaev, Eldar T.
Umarova, Iren R.
Il’icheva, Irina A.
Panchenko, Larisa A.
Polozov, Robert V.
Kondrashov, Fyodor A.
Nechipurenko, Yury D.
Grokhovsky, Sergei L.
A Method for Identification of the Methylation Level of CpG Islands From NGS Data
title A Method for Identification of the Methylation Level of CpG Islands From NGS Data
title_full A Method for Identification of the Methylation Level of CpG Islands From NGS Data
title_fullStr A Method for Identification of the Methylation Level of CpG Islands From NGS Data
title_full_unstemmed A Method for Identification of the Methylation Level of CpG Islands From NGS Data
title_short A Method for Identification of the Methylation Level of CpG Islands From NGS Data
title_sort method for identification of the methylation level of cpg islands from ngs data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7248081/
https://www.ncbi.nlm.nih.gov/pubmed/32451390
http://dx.doi.org/10.1038/s41598-020-65406-1
work_keys_str_mv AT uroshlevleonida amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT abdullaeveldart amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT umarovairenr amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT ilichevairinaa amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT panchenkolarisaa amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT polozovrobertv amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT kondrashovfyodora amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT nechipurenkoyuryd amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT grokhovskysergeil amethodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT uroshlevleonida methodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT abdullaeveldart methodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT umarovairenr methodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT ilichevairinaa methodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT panchenkolarisaa methodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT polozovrobertv methodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT kondrashovfyodora methodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT nechipurenkoyuryd methodforidentificationofthemethylationlevelofcpgislandsfromngsdata
AT grokhovskysergeil methodforidentificationofthemethylationlevelofcpgislandsfromngsdata