MetaTransformer: deep metagenomic sequencing read classification using self-attention models
Deep learning has emerged as a paradigm that revolutionizes numerous domains of scientific research. Transformers have been utilized in language modeling, outperforming previous approaches. Therefore, the use of deep learning as a tool for analyzing genomic sequences is promising, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine learning-based classifier, has recently been introduced for taxonomic prediction at the species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells, resulting in slow runtimes and excessive memory requirements that hamper its usability. We present MetaTransformer, a self-attention-based deep learning tool for metagenomic analysis. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in species- and genus-level classification. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we achieve a 2× to 5× speedup for inference compared to DeepMicrobes while maintaining a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus and 16 hours for species prediction. Our results demonstrate the performance improvements gained from self-attention models and the impact of embedding schemes when applying deep learning to metagenomic sequencing data.
Main Authors: | Wichmann, Alexander, Buschong, Etienne, Müller, André, Jünger, Daniel, Hildebrandt, Andreas, Hankeln, Thomas, Schmidt, Bertil |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Methods Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495543/ https://www.ncbi.nlm.nih.gov/pubmed/37705831 http://dx.doi.org/10.1093/nargab/lqad082 |
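The abstract describes a transformer-encoder classifier operating on k-mer tokenized sequencing reads, with the embedding scheme as the main memory/performance lever. As a rough orientation only, here is a minimal, hypothetical PyTorch sketch of that general architecture; it is not MetaTransformer's published code, and the k-mer length, layer sizes, pooling strategy, and class count are all illustrative assumptions.

```python
# Hypothetical sketch of a self-attention read classifier, NOT the
# published MetaTransformer implementation. A read is tokenized into
# overlapping k-mers; each k-mer index is looked up in an embedding
# table, and a transformer encoder pools the sequence into taxon logits.
import torch
import torch.nn as nn

class ReadClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        # The embedding table dominates memory for large k: a full 12-mer
        # vocabulary has 4^12 rows, which is why alternative embedding
        # schemes (e.g. smaller or shared tables) matter for footprint.
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, kmer_ids):
        # Positional encodings are omitted for brevity; a real model
        # would add them before the encoder.
        x = self.embed(kmer_ids)         # (batch, seq_len, d_model)
        x = self.encoder(x)              # self-attention across k-mer positions
        return self.head(x.mean(dim=1))  # mean-pool, then logits over taxa

# Toy usage: a batch of 8 reads of length 150, tokenized into 8-mers
# (4^8 = 65536 possible k-mers, 150 - 8 + 1 = 143 tokens per read).
model = ReadClassifier(vocab_size=4**8, num_classes=50)
logits = model(torch.randint(0, 4**8, (8, 143)))  # shape (8, 50)
```

Mean pooling keeps the sketch short; per-token attention pooling or a class token would be equally plausible design choices for read-level prediction.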
_version_ | 1785104919416537088 |
---|---|
author | Wichmann, Alexander Buschong, Etienne Müller, André Jünger, Daniel Hildebrandt, Andreas Hankeln, Thomas Schmidt, Bertil |
author_facet | Wichmann, Alexander Buschong, Etienne Müller, André Jünger, Daniel Hildebrandt, Andreas Hankeln, Thomas Schmidt, Bertil |
author_sort | Wichmann, Alexander |
collection | PubMed |
description | Deep learning has emerged as a paradigm that revolutionizes numerous domains of scientific research. Transformers have been utilized in language modeling, outperforming previous approaches. Therefore, the use of deep learning as a tool for analyzing genomic sequences is promising, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine learning-based classifier, has recently been introduced for taxonomic prediction at the species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells, resulting in slow runtimes and excessive memory requirements that hamper its usability. We present MetaTransformer, a self-attention-based deep learning tool for metagenomic analysis. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in species- and genus-level classification. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we achieve a 2× to 5× speedup for inference compared to DeepMicrobes while maintaining a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus and 16 hours for species prediction. Our results demonstrate the performance improvements gained from self-attention models and the impact of embedding schemes when applying deep learning to metagenomic sequencing data. |
format | Online Article Text |
id | pubmed-10495543 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10495543 2023-09-13 MetaTransformer: deep metagenomic sequencing read classification using self-attention models Wichmann, Alexander Buschong, Etienne Müller, André Jünger, Daniel Hildebrandt, Andreas Hankeln, Thomas Schmidt, Bertil NAR Genom Bioinform Methods Article Deep learning has emerged as a paradigm that revolutionizes numerous domains of scientific research. Transformers have been utilized in language modeling, outperforming previous approaches. Therefore, the use of deep learning as a tool for analyzing genomic sequences is promising, yielding convincing results in fields such as motif identification and variant calling. DeepMicrobes, a machine learning-based classifier, has recently been introduced for taxonomic prediction at the species and genus level. However, it relies on complex models based on bidirectional long short-term memory cells, resulting in slow runtimes and excessive memory requirements that hamper its usability. We present MetaTransformer, a self-attention-based deep learning tool for metagenomic analysis. Our transformer-encoder-based models enable efficient parallelization while outperforming DeepMicrobes in species- and genus-level classification. Furthermore, we investigate approaches to reduce memory consumption and boost performance using different embedding schemes. As a result, we achieve a 2× to 5× speedup for inference compared to DeepMicrobes while maintaining a significantly smaller memory footprint. MetaTransformer can be trained in 9 hours for genus and 16 hours for species prediction. Our results demonstrate the performance improvements gained from self-attention models and the impact of embedding schemes when applying deep learning to metagenomic sequencing data. Oxford University Press 2023-09-11 /pmc/articles/PMC10495543/ /pubmed/37705831 http://dx.doi.org/10.1093/nargab/lqad082 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Article Wichmann, Alexander Buschong, Etienne Müller, André Jünger, Daniel Hildebrandt, Andreas Hankeln, Thomas Schmidt, Bertil MetaTransformer: deep metagenomic sequencing read classification using self-attention models |
title | MetaTransformer: deep metagenomic sequencing read classification using self-attention models |
title_full | MetaTransformer: deep metagenomic sequencing read classification using self-attention models |
title_fullStr | MetaTransformer: deep metagenomic sequencing read classification using self-attention models |
title_full_unstemmed | MetaTransformer: deep metagenomic sequencing read classification using self-attention models |
title_short | MetaTransformer: deep metagenomic sequencing read classification using self-attention models |
title_sort | metatransformer: deep metagenomic sequencing read classification using self-attention models |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495543/ https://www.ncbi.nlm.nih.gov/pubmed/37705831 http://dx.doi.org/10.1093/nargab/lqad082 |
work_keys_str_mv | AT wichmannalexander metatransformerdeepmetagenomicsequencingreadclassificationusingselfattentionmodels AT buschongetienne metatransformerdeepmetagenomicsequencingreadclassificationusingselfattentionmodels AT mullerandre metatransformerdeepmetagenomicsequencingreadclassificationusingselfattentionmodels AT jungerdaniel metatransformerdeepmetagenomicsequencingreadclassificationusingselfattentionmodels AT hildebrandtandreas metatransformerdeepmetagenomicsequencingreadclassificationusingselfattentionmodels AT hankelnthomas metatransformerdeepmetagenomicsequencingreadclassificationusingselfattentionmodels AT schmidtbertil metatransformerdeepmetagenomicsequencingreadclassificationusingselfattentionmodels |