DL-ADR: a novel deep learning model for classifying genomic variants into adverse drug reactions

BACKGROUND: Genomic variations are associated with the metabolism of many therapeutic agents and with the occurrence of adverse reactions to them. Polymorphisms at over 2000 locations of cytochrome P450 enzymes (CYP), arising from factors such as ethnicity, mutations, and inheritance, contribute to the diversity...

Bibliographic Details
Main authors: Liang, Zhaohui; Huang, Jimmy Xiangji; Zeng, Xing; Zhang, Gang
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4980789/
https://www.ncbi.nlm.nih.gov/pubmed/27510822
http://dx.doi.org/10.1186/s12920-016-0207-4
_version_ 1782447516314238976
author Liang, Zhaohui
Huang, Jimmy Xiangji
Zeng, Xing
Zhang, Gang
collection PubMed
description BACKGROUND: Genomic variations are associated with the metabolism of many therapeutic agents and with the occurrence of adverse reactions to them. Polymorphisms at over 2000 locations of cytochrome P450 enzymes (CYP), arising from factors such as ethnicity, mutations, and inheritance, contribute to the diversity of responses to and side effects of various drugs. The associations among single nucleotide polymorphisms (SNPs), internal pharmacokinetic patterns, and vulnerability to specific adverse reactions have become a research interest of pharmacogenomics. Conventional genome-wide association studies (GWAS) mainly focus on the relation of single or multiple SNPs to a specific risk factor, which is a one-to-many relation. However, there are no robust methods to establish a many-to-many network that combines the direct and indirect associations between multiple SNPs and a series of events (e.g. adverse reactions, metabolic patterns, prognostic factors). In this paper, we present a novel deep learning model based on generative stochastic networks and a hidden Markov chain that classifies observed samples with SNPs on five loci of two genes (CYP2D6 and CYP1A2) into the vulnerable populations of 14 types of adverse reactions. METHODS: A supervised deep learning model is proposed in this study. It is a revised generative stochastic network (GSN) model whose transitions are driven by a hidden Markov chain. The training data are collected from clinical observation. The training set is composed of 83 observations of blood samples genotyped at CYP2D6*2, *10, *14 and CYP1A2*1C, *1F. The samples are genotyped by the polymerase chain reaction (PCR) method. A hidden Markov chain is used as the transition operator to simulate the probabilistic distribution. The model can learn at lower cost than the conventional maximum likelihood method because the transition distribution is conditional on the previous state of the hidden Markov chain. A least square loss (LASSO) algorithm and a k-Nearest Neighbors (kNN) algorithm are used as baselines to evaluate the performance of the proposed deep learning model. RESULTS: There are 53 adverse reactions reported during the observation. They are assigned to 14 categories. In the comparison of classification accuracy, the deep learning model outperforms the LASSO and kNN models, with an accuracy rate over 80%. In the comparison of reliability, the deep learning model shows the best stability among the three models. CONCLUSIONS: Machine learning provides a new method to explore the complex associations among genomic variations and multiple events in pharmacogenomics studies. The new deep learning algorithm is capable of classifying various SNPs to the corresponding adverse reactions. We expect that as more genomic variations are added as features and more observations are made, the deep learning model can improve its performance and can act as a black-box but reliable verifier for other GWAS studies.
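The kNN baseline named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the genotype encoding (variant-allele counts of 0, 1 or 2 at the five loci), the Hamming distance, the value of k, the toy samples, and the labels `none` and `category_A` are all assumptions made for illustration only and are not taken from the study data.

```python
from collections import Counter


def hamming(a, b):
    """Number of loci at which two genotype vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    # sorted() is stable, so samples at equal distance keep their training order.
    neighbours = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data (invented, NOT from the paper): variant-allele counts
# at five loci in the order CYP2D6*2, *10, *14, CYP1A2*1C, *1F.
TRAIN = [
    ((0, 0, 0, 0, 0), "none"),
    ((0, 1, 0, 0, 0), "none"),
    ((1, 0, 0, 0, 0), "none"),
    ((2, 2, 0, 1, 0), "category_A"),
    ((2, 1, 0, 1, 0), "category_A"),
    ((1, 2, 0, 1, 1), "category_A"),
]

if __name__ == "__main__":
    # Classify one unseen (also invented) genotype vector.
    print(knn_predict(TRAIN, (2, 2, 0, 1, 1)))
```

Hamming distance is used here only because the features are categorical genotype codes; the abstract does not state which distance or which k the authors chose.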
format Online
Article
Text
id pubmed-4980789
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4980789 2016-08-19 DL-ADR: a novel deep learning model for classifying genomic variants into adverse drug reactions Liang, Zhaohui Huang, Jimmy Xiangji Zeng, Xing Zhang, Gang BMC Med Genomics Research BioMed Central 2016-08-10 /pmc/articles/PMC4980789/ /pubmed/27510822 http://dx.doi.org/10.1186/s12920-016-0207-4 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title DL-ADR: a novel deep learning model for classifying genomic variants into adverse drug reactions
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4980789/
https://www.ncbi.nlm.nih.gov/pubmed/27510822
http://dx.doi.org/10.1186/s12920-016-0207-4