DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies

Deep learning has revolutionized data science in many fields by greatly improving prediction performances in comparison to conventional approaches. Recently, explainable artificial intelligence has emerged as an area of research that goes beyond pure prediction improvement by extracting knowledge fr...

Bibliographic Details
Main Authors: Mieth, Bettina, Rozier, Alexandre, Rodriguez, Juan Antonio, Höhne, Marina M C, Görnitz, Nico, Müller, Klaus-Robert
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8291080/
https://www.ncbi.nlm.nih.gov/pubmed/34296082
http://dx.doi.org/10.1093/nargab/lqab065
_version_ 1783724578570240000
author Mieth, Bettina
Rozier, Alexandre
Rodriguez, Juan Antonio
Höhne, Marina M C
Görnitz, Nico
Müller, Klaus-Robert
author_facet Mieth, Bettina
Rozier, Alexandre
Rodriguez, Juan Antonio
Höhne, Marina M C
Görnitz, Nico
Müller, Klaus-Robert
author_sort Mieth, Bettina
collection PubMed
description Deep learning has revolutionized data science in many fields by greatly improving prediction performances in comparison to conventional approaches. Recently, explainable artificial intelligence has emerged as an area of research that goes beyond pure prediction improvement by extracting knowledge from deep learning methodologies through the interpretation of their results. We investigate such explanations to explore the genetic architectures of phenotypes in genome-wide association studies. Instead of testing each position in the genome individually, the novel three-step algorithm, called DeepCOMBI, first trains a neural network for the classification of subjects into their respective phenotypes. Second, it explains the classifiers’ decisions by applying layer-wise relevance propagation as one example from the pool of explanation techniques. The resulting importance scores are eventually used to determine a subset of the most relevant locations for multiple hypothesis testing in the third step. The performance of DeepCOMBI in terms of power and precision is investigated on generated datasets and a 2007 study. Verification of the latter is achieved by validating all findings with independent studies published up until 2020. DeepCOMBI is shown to outperform ordinary raw P-value thresholding and other baseline methods. Two novel disease associations (rs10889923 for hypertension, rs4769283 for type 1 diabetes) were identified.
format Online
Article
Text
id pubmed-8291080
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-82910802021-07-21 DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies Mieth, Bettina Rozier, Alexandre Rodriguez, Juan Antonio Höhne, Marina M C Görnitz, Nico Müller, Klaus-Robert NAR Genom Bioinform Standard Article Deep learning has revolutionized data science in many fields by greatly improving prediction performances in comparison to conventional approaches. Recently, explainable artificial intelligence has emerged as an area of research that goes beyond pure prediction improvement by extracting knowledge from deep learning methodologies through the interpretation of their results. We investigate such explanations to explore the genetic architectures of phenotypes in genome-wide association studies. Instead of testing each position in the genome individually, the novel three-step algorithm, called DeepCOMBI, first trains a neural network for the classification of subjects into their respective phenotypes. Second, it explains the classifiers’ decisions by applying layer-wise relevance propagation as one example from the pool of explanation techniques. The resulting importance scores are eventually used to determine a subset of the most relevant locations for multiple hypothesis testing in the third step. The performance of DeepCOMBI in terms of power and precision is investigated on generated datasets and a 2007 study. Verification of the latter is achieved by validating all findings with independent studies published up until 2020. DeepCOMBI is shown to outperform ordinary raw P-value thresholding and other baseline methods. Two novel disease associations (rs10889923 for hypertension, rs4769283 for type 1 diabetes) were identified. Oxford University Press 2021-07-20 /pmc/articles/PMC8291080/ /pubmed/34296082 http://dx.doi.org/10.1093/nargab/lqab065 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Standard Article
Mieth, Bettina
Rozier, Alexandre
Rodriguez, Juan Antonio
Höhne, Marina M C
Görnitz, Nico
Müller, Klaus-Robert
DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies
title DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies
title_full DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies
title_fullStr DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies
title_full_unstemmed DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies
title_short DeepCOMBI: explainable artificial intelligence for the analysis and discovery in genome-wide association studies
title_sort deepcombi: explainable artificial intelligence for the analysis and discovery in genome-wide association studies
topic Standard Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8291080/
https://www.ncbi.nlm.nih.gov/pubmed/34296082
http://dx.doi.org/10.1093/nargab/lqab065
work_keys_str_mv AT miethbettina deepcombiexplainableartificialintelligencefortheanalysisanddiscoveryingenomewideassociationstudies
AT rozieralexandre deepcombiexplainableartificialintelligencefortheanalysisanddiscoveryingenomewideassociationstudies
AT rodriguezjuanantonio deepcombiexplainableartificialintelligencefortheanalysisanddiscoveryingenomewideassociationstudies
AT hohnemarinamc deepcombiexplainableartificialintelligencefortheanalysisanddiscoveryingenomewideassociationstudies
AT gornitznico deepcombiexplainableartificialintelligencefortheanalysisanddiscoveryingenomewideassociationstudies
AT mullerklausrobert deepcombiexplainableartificialintelligencefortheanalysisanddiscoveryingenomewideassociationstudies
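
Illustrative note on the method summarized in the description field. The three-step procedure (train a classifier on the genotypes, score each position by its relevance to the classifier's decisions, then run hypothesis tests only on the top-scoring positions) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation: a one-layer logistic model stands in for the neural network, mean |input x weight| stands in for layer-wise relevance propagation, a chi-square genotype test with a Bonferroni threshold over the selected positions stands in for the testing step (the record does not say how the threshold is calibrated), and the data are simulated. All function names and parameters are hypothetical.

```python
import numpy as np


def simulate(seed=0, n=600, d=50, causal=7):
    """Simulated genotypes (0/1/2 minor-allele counts) with one causal position."""
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, 0.3, size=(n, d))
    logits = 1.5 * (X[:, causal] - 0.6)
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)
    return X, y


def train_logistic(X, y, lr=0.1, epochs=300):
    """Step 1 (stand-in): fit a one-layer logistic classifier by gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y
        w -= lr * (X.T @ grad) / n
        b -= lr * grad.mean()
    return w, b


def relevance_scores(X, w):
    """Step 2 (stand-in): per-position relevance as mean |input * weight|."""
    return np.abs(X * w).mean(axis=0)


def chi2_pvalue(genotypes, y):
    """Chi-square test on the 2x3 phenotype-by-genotype table.

    With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    If a genotype class is empty the true df is 1 and this value is conservative.
    """
    table = np.zeros((2, 3))
    for label in (0, 1):
        for g in (0, 1, 2):
            table[label, g] = np.sum((y == label) & (genotypes == g))
    expected = (table.sum(axis=1, keepdims=True)
                * table.sum(axis=0, keepdims=True) / table.sum())
    mask = expected > 0
    chi2 = np.sum((table[mask] - expected[mask]) ** 2 / expected[mask])
    return float(np.exp(-chi2 / 2.0))


def deepcombi_sketch(X, y, k=5, alpha=0.05):
    """Run the three steps and return selected positions, p-values, and hits."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)  # standardize for training
    w, _ = train_logistic(Xs, y)
    scores = relevance_scores(Xs, w)
    selected = [int(j) for j in np.argsort(scores)[::-1][:k]]
    # Step 3: test only the k selected positions (Bonferroni over k is an assumption).
    pvalues = {j: chi2_pvalue(X[:, j], y) for j in selected}
    significant = [j for j in selected if pvalues[j] < alpha / k]
    return {"selected": selected, "pvalues": pvalues, "significant": significant}


if __name__ == "__main__":
    X, y = simulate()
    result = deepcombi_sketch(X, y)
    print("selected positions:", result["selected"])
    print("significant positions:", result["significant"])
```

The point of the sketch is the ordering of the steps: the relevance scores act as a filter, so the number of hypothesis tests drops from the total number of positions to k. A real analysis would use a deep network, an actual layer-wise relevance propagation implementation, and a properly calibrated significance threshold.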