
Identifying diagnosis-specific genotype–phenotype associations via joint multitask sparse canonical correlation analysis and classification


Bibliographic Details
Main authors: Du, Lei; Liu, Fang; Liu, Kefei; Yao, Xiaohui; Risacher, Shannon L; Han, Junwei; Guo, Lei; Saykin, Andrew J; Shen, Li
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355274/
https://www.ncbi.nlm.nih.gov/pubmed/32657360
http://dx.doi.org/10.1093/bioinformatics/btaa434
author Du, Lei
Liu, Fang
Liu, Kefei
Yao, Xiaohui
Risacher, Shannon L
Han, Junwei
Guo, Lei
Saykin, Andrew J
Shen, Li
author_sort Du, Lei
collection PubMed
description MOTIVATION: Brain imaging genetics studies the complex associations between genotypic data such as single nucleotide polymorphisms (SNPs) and imaging quantitative traits (QTs). Neurodegenerative disorders usually exhibit diversity and heterogeneity, so different diagnostic groups might carry distinct imaging QTs, SNPs and interactions between them. Sparse canonical correlation analysis (SCCA) is widely used to identify bi-multivariate genotype–phenotype associations. However, most existing SCCA methods are unsupervised and therefore cannot identify diagnosis-specific genotype–phenotype associations. RESULTS: In this article, we propose a new joint multitask learning method, named MT–SCCALR, which combines the merits of SCCA and logistic regression. MT–SCCALR learns the genotype–phenotype associations of multiple tasks jointly, with each task focusing on identifying one diagnosis-specific genotype–phenotype pattern. MT–SCCALR can not only select the SNPs and imaging QTs relevant to each individual diagnostic group, but also select those shared by multiple diagnostic groups. We derive an efficient optimization algorithm whose convergence to a local optimum is guaranteed. Compared with two state-of-the-art methods, MT–SCCALR yields better or similar canonical correlation coefficients and classification performance. In addition, its canonical weight patterns are much more discriminative than those of its competitors. This demonstrates the power and capability of MT–SCCALR in identifying diagnostically heterogeneous genotype–phenotype patterns, which should help in understanding the pathophysiology of brain disorders. AVAILABILITY AND IMPLEMENTATION: The software is publicly available at https://github.com/dulei323/MTSCCALR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
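To make the SCCA building block mentioned in the abstract concrete, the following is a minimal sketch of a generic, single-task, unsupervised sparse CCA, solved by alternating soft-thresholded updates. It is not MT–SCCALR and is not taken from the authors' repository: the function names (`soft_threshold`, `sparse_cca`), the penalty parameters `lam_u` and `lam_v`, and the simplifying assumption that each view's own covariance is treated as the identity are all illustrative choices made here.

```python
import numpy as np


def soft_threshold(v, lam):
    """Element-wise soft-thresholding (proximal operator of the L1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def unit_norm(v):
    """Scale v to unit L2 norm; an all-zero vector is returned unchanged."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v


def sparse_cca(X, Y, lam_u=0.3, lam_v=0.3, n_iter=100, tol=1e-8):
    """Hypothetical helper: one pair of sparse canonical weight vectors.

    X: (n, p) array, e.g. SNP data; Y: (n, q) array, e.g. imaging QTs.
    Assumption: within-view covariances are approximated by the identity,
    so each update is a soft-thresholded power-iteration step.
    Returns (u, v, corr), where corr is the correlation of X @ u and Y @ v.
    """
    # Standardize each column so that C below is a cross-correlation matrix.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]

    # Initialize v with the leading right singular vector of C.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[0]
    u = np.zeros(C.shape[0])

    for _ in range(n_iter):
        u_new = unit_norm(soft_threshold(C @ v, lam_u))
        v_new = unit_norm(soft_threshold(C.T @ u_new, lam_v))
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break

    corr = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, corr
```

Because this sketch never sees diagnostic labels, it returns a single pattern for the whole cohort, which is the limitation the abstract attributes to unsupervised SCCA. If the penalties are set too high, both weight vectors shrink to zero and the returned correlation is undefined.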
format Online
Article
Text
id pubmed-7355274
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7355274 2020-07-16 Bioinformatics Studies of Phenotypes and Clinical Applications Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355274/ /pubmed/32657360 http://dx.doi.org/10.1093/bioinformatics/btaa434 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Identifying diagnosis-specific genotype–phenotype associations via joint multitask sparse canonical correlation analysis and classification
topic Studies of Phenotypes and Clinical Applications