
Differentiable Learning of Sequence-Specific Minimizer Schemes with DeepMinimizer

Minimizers are widely used to sample representative k-mers from biological sequences in many applications, such as read mapping and taxonomy prediction. In most scenarios, having the minimizer scheme select as few k-mer positions as possible (i.e., having a low density) is desirable to reduce comput...
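To make the terms in this summary concrete, here is a minimal sketch of a classical (w, k) minimizer scheme and its density on one sequence. It is not the DeepMinimizer method. The function names, the lexicographic default ordering, and the definition of density as selected positions divided by the total number of k-mers are assumptions made for illustration.

```python
from typing import Callable, List


def minimizer_positions(
    seq: str,
    k: int,
    w: int,
    order: Callable[[str], object] = lambda kmer: kmer,
) -> List[int]:
    """Start positions of the k-mers picked by a (w, k) minimizer scheme.

    `order` maps a k-mer to a sortable rank. The default is plain
    lexicographic order, which is an illustrative choice only.
    """
    n_kmers = len(seq) - k + 1
    if n_kmers < w:
        return []
    selected = set()
    # Slide a window over every run of w consecutive k-mers.
    for start in range(n_kmers - w + 1):
        window = range(start, start + w)
        # Pick the lowest-ranked k-mer; ties go to the leftmost position.
        best = min(window, key=lambda i: (order(seq[i:i + k]), i))
        selected.add(best)
    return sorted(selected)


def density(seq: str, k: int, w: int, order=lambda kmer: kmer) -> float:
    """Fraction of k-mer positions selected (assumed definition)."""
    n_kmers = len(seq) - k + 1
    return len(minimizer_positions(seq, k, w, order)) / n_kmers


print(minimizer_positions("ACGTACGT", k=2, w=3))  # [0, 1, 4]
print(density("ACGTACGT", k=2, w=3))              # 3 of 7 k-mers selected
```

Passing a different `order` function changes which positions are selected. Searching for an ordering that lowers density on a given target sequence is the optimization problem the summary refers to.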


Bibliographic Details
Main Authors:	Hoang, Minh, Zheng, Hongyu, Kingsford, Carl
Format:	Online Article Text
Language:	English
Published:	Mary Ann Liebert, Inc., publishers 2022
Subjects:	Research Articles
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9807081/ https://www.ncbi.nlm.nih.gov/pubmed/36095142 http://dx.doi.org/10.1089/cmb.2022.0275

author	Hoang, Minh Zheng, Hongyu Kingsford, Carl
collection	PubMed
description	Minimizers are widely used to sample representative k-mers from biological sequences in many applications, such as read mapping and taxonomy prediction. In most scenarios, having the minimizer scheme select as few k-mer positions as possible (i.e., having a low density) is desirable to reduce computation and memory cost. Despite the growing interest in minimizers, learning an effective scheme with optimal density is still an open question, as it requires solving an apparently challenging discrete optimization problem on the permutation space of k-mer orderings. Most existing schemes are designed to work well in expectation over random sequences, which have limited applicability to many practical tools. On the other hand, several methods have been proposed to construct minimizer schemes for a specific target sequence. These methods, however, only approximate the original objective with likewise discrete surrogate tasks that are not able to significantly improve the density performance. This article introduces the first continuous relaxation of the density minimizing objective, DeepMinimizer, which employs a novel Deep Learning twin architecture to simultaneously ensure both validity and performance of the minimizer scheme. Our surrogate objective is fully differentiable and, therefore, amenable to efficient gradient-based optimization using GPU computing. Finally, we demonstrate that DeepMinimizer discovers minimizer schemes that significantly outperform state-of-the-art constructions on human genomic sequences.
format	Online Article Text
id	pubmed-9807081
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Mary Ann Liebert, Inc., publishers
record_format	MEDLINE/PubMed
spelling	pubmed-9807081 2023-01-10 Differentiable Learning of Sequence-Specific Minimizer Schemes with DeepMinimizer Hoang, Minh Zheng, Hongyu Kingsford, Carl J Comput Biol Research Articles Mary Ann Liebert, Inc., publishers 2022-12-01 2022-12-13 /pmc/articles/PMC9807081/ /pubmed/36095142 http://dx.doi.org/10.1089/cmb.2022.0275 Text en © Minh Hoang, et al., 2022. Published by Mary Ann Liebert, Inc. This Open Access article is distributed under the terms of the Creative Commons License [CC-BY] (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Differentiable Learning of Sequence-Specific Minimizer Schemes with DeepMinimizer
topic	Research Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9807081/ https://www.ncbi.nlm.nih.gov/pubmed/36095142 http://dx.doi.org/10.1089/cmb.2022.0275
