
Detecting epigenetic motifs in low coverage and metagenomics settings

BACKGROUND: It has recently become possible to rapidly and accurately detect epigenetic signatures in bacterial genomes using third-generation sequencing data. Monitoring the speed at which a single polymerase inserts a base in the read strand enables one to infer whether a modification is present a...


Bibliographic Details
Main Authors: Beckmann, Noam D, Karri, Sashank, Fang, Gang, Bashir, Ali
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4168715/
https://www.ncbi.nlm.nih.gov/pubmed/25253358
http://dx.doi.org/10.1186/1471-2105-15-S9-S16
_version_ 1782335605217165312
author Beckmann, Noam D
Karri, Sashank
Fang, Gang
Bashir, Ali
collection PubMed
description BACKGROUND: It has recently become possible to rapidly and accurately detect epigenetic signatures in bacterial genomes using third-generation sequencing data. Monitoring the speed at which a single polymerase inserts a base in the read strand enables one to infer whether a modification is present at that specific site on the template strand. These sites can be challenging to detect in the absence of high coverage and reliable reference genomes. METHODS: Here we provide a new method for detecting epigenetic motifs in bacteria on datasets with low coverage, with incomplete references, and with mixed samples (i.e., metagenomic data). Our approach treats motif inference as a kmer comparison problem. First, genomes (or contigs) are deconstructed into kmers. Then, native genome-wide distributions of interpulse durations (IPDs) for kmers are compared with the corresponding whole genome amplified (WGA, modification-free) IPD distributions using log likelihood ratios. Finally, kmers are ranked and greedily selected by iteratively correcting for sequences within a particular kmer's neighborhood. CONCLUSIONS: Our method can detect multiple types of modifications, even at very low coverage and in the presence of mixed genomes. Additionally, we are able to predict modified motifs when genomes with "neighbor" modified motifs exist within the sample. Lastly, we show that these motifs can provide an alternative source of information by which to cluster metagenomic contigs, and that iterative refinement on these clustered contigs can further improve both the sensitivity and the specificity of motif detection. AVAILABILITY: https://github.com/alibashir/EMMCKmer
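The first two METHODS steps (deconstruct sequences into kmers, then compare native and WGA IPD distributions with a log likelihood ratio) can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the EMMCKmer repository for that). The function names, the choice of a Gaussian model on log-IPDs, the `center` offset, and the `min_obs` and `min_sigma` parameters are all assumptions made for illustration, and the greedy neighborhood correction step is left out.

```python
import math
from collections import defaultdict
from statistics import fmean, pstdev


def kmer_ipds(seq, ipds, k, center):
    """Group per-base IPD values by the kmer that surrounds each position.

    `center` (hypothetical parameter) is the offset inside the kmer of the
    base whose IPD is recorded for that kmer.
    """
    groups = defaultdict(list)
    for start in range(len(seq) - k + 1):
        groups[seq[start:start + k]].append(ipds[start + center])
    return groups


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and std sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def llr_score(native, wga, min_sigma=1e-3):
    """Log likelihood ratio of native log-IPDs: native fit versus WGA fit.

    Assumption: log-IPDs are modeled as Gaussian; the abstract does not
    state which distribution the published method uses.
    """
    nat = [math.log(v) for v in native]
    ctl = [math.log(v) for v in wga]
    mu_n, sd_n = fmean(nat), max(pstdev(nat), min_sigma)
    mu_c, sd_c = fmean(ctl), max(pstdev(ctl), min_sigma)
    return sum(
        gaussian_logpdf(x, mu_n, sd_n) - gaussian_logpdf(x, mu_c, sd_c)
        for x in nat
    )


def rank_kmers(native_groups, wga_groups, min_obs=2):
    """Score every kmer seen often enough in both samples, best first."""
    scores = {}
    for kmer, native in native_groups.items():
        wga = wga_groups.get(kmer, [])
        if len(native) >= min_obs and len(wga) >= min_obs:
            scores[kmer] = llr_score(native, wga)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `rank_kmers(kmer_ipds(seq, native_ipds, 4, 1), kmer_ipds(seq, wga_ipds, 4, 1))` returns `(kmer, score)` pairs sorted by score, so kmers whose native IPDs deviate most from the WGA control come first. A full pipeline would then apply the greedy selection described above to suppress high-scoring kmers that merely overlap a true motif.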
format Online
Article
Text
id pubmed-4168715
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4168715 2014-10-02 Detecting epigenetic motifs in low coverage and metagenomics settings Beckmann, Noam D Karri, Sashank Fang, Gang Bashir, Ali BMC Bioinformatics Proceedings BioMed Central 2014-09-10 /pmc/articles/PMC4168715/ /pubmed/25253358 http://dx.doi.org/10.1186/1471-2105-15-S9-S16 Text en Copyright © 2014 Beckmann et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Detecting epigenetic motifs in low coverage and metagenomics settings
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4168715/
https://www.ncbi.nlm.nih.gov/pubmed/25253358
http://dx.doi.org/10.1186/1471-2105-15-S9-S16