Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases

Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact...

Bibliographic Details
Main authors: Schadt, Eric E., Banerjee, Onureena, Fang, Gang, Feng, Zhixing, Wong, Wing H., Zhang, Xuegong, Kislyuk, Andrey, Clark, Tyson A., Luong, Khai, Keren-Paz, Alona, Chess, Andrew, Kumar, Vipin, Chen-Plotkin, Alice, Sondheimer, Neal, Korlach, Jonas, Kasarskis, Andrew
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3530673/
https://www.ncbi.nlm.nih.gov/pubmed/23093720
http://dx.doi.org/10.1101/gr.136739.111
_version_ 1782254044859858944
author Schadt, Eric E.
Banerjee, Onureena
Fang, Gang
Feng, Zhixing
Wong, Wing H.
Zhang, Xuegong
Kislyuk, Andrey
Clark, Tyson A.
Luong, Khai
Keren-Paz, Alona
Chess, Andrew
Kumar, Vipin
Chen-Plotkin, Alice
Sondheimer, Neal
Korlach, Jonas
Kasarskis, Andrew
author_sort Schadt, Eric E.
collection PubMed
description Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single-molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date no statistical framework has been proposed to enhance the power to detect these events while also controlling for false-positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites that help enhance the power to detect KV events. The performance of this and related models is explored, with the best-performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events, while others represent putative chemically modified sites of unknown types.
format Online
Article
Text
id pubmed-3530673
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3530673 2013-01-01 Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases Schadt, Eric E. Banerjee, Onureena Fang, Gang Feng, Zhixing Wong, Wing H. Zhang, Xuegong Kislyuk, Andrey Clark, Tyson A. Luong, Khai Keren-Paz, Alona Chess, Andrew Kumar, Vipin Chen-Plotkin, Alice Sondheimer, Neal Korlach, Jonas Kasarskis, Andrew Genome Res Method Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single-molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date no statistical framework has been proposed to enhance the power to detect these events while also controlling for false-positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites that help enhance the power to detect KV events. The performance of this and related models is explored, with the best-performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events, while others represent putative chemically modified sites of unknown types. Cold Spring Harbor Laboratory Press 2013-01 /pmc/articles/PMC3530673/ /pubmed/23093720 http://dx.doi.org/10.1101/gr.136739.111 Text en © 2013, Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.
spellingShingle Method
Schadt, Eric E.
Banerjee, Onureena
Fang, Gang
Feng, Zhixing
Wong, Wing H.
Zhang, Xuegong
Kislyuk, Andrey
Clark, Tyson A.
Luong, Khai
Keren-Paz, Alona
Chess, Andrew
Kumar, Vipin
Chen-Plotkin, Alice
Sondheimer, Neal
Korlach, Jonas
Kasarskis, Andrew
Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases
title Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3530673/
https://www.ncbi.nlm.nih.gov/pubmed/23093720
http://dx.doi.org/10.1101/gr.136739.111