LanceOtron: a deep learning peak caller for genome sequencing experiments

MOTIVATION: Genome sequencing experiments have revolutionized molecular biology by allowing researchers to identify important DNA-encoded elements genome wide. Regions where these elements are found appear as peaks in the analog signal of an assay’s coverage track, and despite the ease with which hu...

Bibliographic Details
Main authors: Hentges, Lance D; Sergeant, Martin J; Cole, Christopher B; Downes, Damien J; Hughes, Jim R; Taylor, Stephen
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9477537/
https://www.ncbi.nlm.nih.gov/pubmed/35866989
http://dx.doi.org/10.1093/bioinformatics/btac525
author Hentges, Lance D
Sergeant, Martin J
Cole, Christopher B
Downes, Damien J
Hughes, Jim R
Taylor, Stephen
collection PubMed
description MOTIVATION: Genome sequencing experiments have revolutionized molecular biology by allowing researchers to identify important DNA-encoded elements genome wide. Regions where these elements are found appear as peaks in the analog signal of an assay’s coverage track, and despite the ease with which humans can visually categorize these patterns, the size of many genomes necessitates algorithmic implementations. Commonly used methods focus on statistical tests to classify peaks, discounting that the background signal does not completely follow any known probability distribution and reducing the information-dense peak shapes to simply maximum height. Deep learning has been shown to be highly accurate for many pattern recognition tasks, on par or even exceeding human capabilities, providing an opportunity to reimagine and improve peak calling.
RESULTS: We present the peak calling framework LanceOtron, which combines deep learning for recognizing peak shape with multifaceted enrichment calculations for assessing significance. In benchmarking ATAC-seq, ChIP-seq and DNase-seq, LanceOtron outperforms long-standing, gold-standard peak callers through its improved selectivity and near-perfect sensitivity.
AVAILABILITY AND IMPLEMENTATION: A fully featured web application is freely available from LanceOtron.molbiol.ox.ac.uk, a command line interface via Python is pip installable from PyPI at https://pypi.org/project/lanceotron/, and source code and benchmarking tests are available at https://github.com/LHentges/LanceOtron.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9477537
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9477537 2022-09-19 LanceOtron: a deep learning peak caller for genome sequencing experiments
Hentges, Lance D
Sergeant, Martin J
Cole, Christopher B
Downes, Damien J
Hughes, Jim R
Taylor, Stephen
Bioinformatics
Original Papers
Oxford University Press 2022-07-22
/pmc/articles/PMC9477537/
/pubmed/35866989
http://dx.doi.org/10.1093/bioinformatics/btac525
Text en
© The Author(s) 2022. Published by Oxford University Press.
https://creativecommons.org/licenses/by/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title LanceOtron: a deep learning peak caller for genome sequencing experiments
topic Original Papers