
Sequence-specific bias correction for RNA-seq data using recurrent neural networks

BACKGROUND: The recent success of deep learning techniques in machine learning and artificial intelligence has stimulated a great deal of interest among bioinformaticians, who now wish to bring the power of deep learning to bear on a host of bioinformatical problems. Deep learning is ideally suited...


Bibliographic Details
Main Authors: Zhang, Yao-zhong, Yamaguchi, Rui, Imoto, Seiya, Miyano, Satoru
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5310274/
https://www.ncbi.nlm.nih.gov/pubmed/28198674
http://dx.doi.org/10.1186/s12864-016-3262-5
_version_ 1782507843316875264
author Zhang, Yao-zhong
Yamaguchi, Rui
Imoto, Seiya
Miyano, Satoru
author_facet Zhang, Yao-zhong
Yamaguchi, Rui
Imoto, Seiya
Miyano, Satoru
author_sort Zhang, Yao-zhong
collection PubMed
description BACKGROUND: The recent success of deep learning techniques in machine learning and artificial intelligence has stimulated a great deal of interest among bioinformaticians, who now wish to bring the power of deep learning to bear on a host of bioinformatical problems. Deep learning is ideally suited for biological problems that require automatic or hierarchical feature representation for biological data when prior knowledge is limited. In this work, we address the sequence-specific bias correction problem for RNA-seq data using Recurrent Neural Networks (RNNs) to model nucleotide sequences without pre-determining sequence structures. The sequence-specific bias of a read is then calculated based on the sequence probabilities estimated by RNNs, and used in the estimation of gene abundance. RESULTS: We explore the application of two popular RNN recurrent units for this task and demonstrate that RNN-based approaches provide a flexible way to model nucleotide sequences without knowledge of predetermined sequence structures. Our experiments show that training an RNN-based nucleotide sequence model is efficient and that RNN-based bias correction methods compare well with the state-of-the-art sequence-specific bias correction method on the commonly used MAQC-III data set. CONCLUSIONS: RNNs provide an alternative and flexible way to calculate sequence-specific bias without explicitly pre-determining sequence structures. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3262-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5310274
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5310274 2017-02-22 Sequence-specific bias correction for RNA-seq data using recurrent neural networks Zhang, Yao-zhong Yamaguchi, Rui Imoto, Seiya Miyano, Satoru BMC Genomics Research BACKGROUND: The recent success of deep learning techniques in machine learning and artificial intelligence has stimulated a great deal of interest among bioinformaticians, who now wish to bring the power of deep learning to bear on a host of bioinformatical problems. Deep learning is ideally suited for biological problems that require automatic or hierarchical feature representation for biological data when prior knowledge is limited. In this work, we address the sequence-specific bias correction problem for RNA-seq data using Recurrent Neural Networks (RNNs) to model nucleotide sequences without pre-determining sequence structures. The sequence-specific bias of a read is then calculated based on the sequence probabilities estimated by RNNs, and used in the estimation of gene abundance. RESULTS: We explore the application of two popular RNN recurrent units for this task and demonstrate that RNN-based approaches provide a flexible way to model nucleotide sequences without knowledge of predetermined sequence structures. Our experiments show that training an RNN-based nucleotide sequence model is efficient and that RNN-based bias correction methods compare well with the state-of-the-art sequence-specific bias correction method on the commonly used MAQC-III data set. CONCLUSIONS: RNNs provide an alternative and flexible way to calculate sequence-specific bias without explicitly pre-determining sequence structures. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3262-5) contains supplementary material, which is available to authorized users.
BioMed Central 2017-01-25 /pmc/articles/PMC5310274/ /pubmed/28198674 http://dx.doi.org/10.1186/s12864-016-3262-5 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Zhang, Yao-zhong
Yamaguchi, Rui
Imoto, Seiya
Miyano, Satoru
Sequence-specific bias correction for RNA-seq data using recurrent neural networks
title Sequence-specific bias correction for RNA-seq data using recurrent neural networks
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5310274/
https://www.ncbi.nlm.nih.gov/pubmed/28198674
http://dx.doi.org/10.1186/s12864-016-3262-5