Vec2image: an explainable artificial intelligence model for the feature representation and classification of high-dimensional biological data by vector-to-image conversion

Feature representation and discriminative learning are proven models and technologies in artificial intelligence fields; however, major challenges for machine learning on large biological datasets are learning an effective model with mechanistical explanation on the model determination and predictio...

Bibliographic Details
Main Authors: Tang, Hui; Yu, Xiangtian; Liu, Rui; Zeng, Tao
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921615/
https://www.ncbi.nlm.nih.gov/pubmed/35106553
http://dx.doi.org/10.1093/bib/bbab584
author Tang, Hui
Yu, Xiangtian
Liu, Rui
Zeng, Tao
collection PubMed
description Feature representation and discriminative learning are proven models and technologies in artificial intelligence; however, a major challenge for machine learning on large biological datasets is learning an effective model that also offers a mechanistic explanation of how the model is determined and how it predicts. To meet this demand, we developed Vec2image, an explainable convolutional neural network framework covering feature engineering, feature selection and classifier training. It is mainly based on the combination of principal component coordinate conversion, deep residual neural networks and embedded k-nearest neighbor representation on pseudo images of high-dimensional biological data, where the pseudo images represent feature measurements and feature associations simultaneously. Vec2image achieved better performance than other popular methods and demonstrated its efficiency at feature selection in cell marker identification from tissue-specific single-cell datasets. In particular, in a case study on type 2 diabetes (T2D) using multiple human islet scRNA-seq datasets, Vec2image first displayed robust performance in building T2D classification models across different datasets; then a specific Vec2image model was trained to accurately recognize the cell state and efficiently rank feature genes relevant to T2D, which uncovered potential T2D cellular pathogenesis; finally, cell activity changes, cell composition imbalances and cell–cell communication dysfunctions were associated with the identified T2D feature genes from both population-shared and individual-specific perspectives. Collectively, Vec2image is a new and efficient explainable artificial intelligence methodology that can be widely applied to human-readable classification and prediction based on pseudo image representation of biological deep sequencing data.
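The vector-to-image idea in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each feature is placed on a small pixel grid according to its first two principal-component coordinates, and that features landing on the same pixel are averaged. The function names `feature_pixels` and `vector_to_image`, the grid size and the averaging rule are all hypothetical choices made for illustration; the residual network and k-nearest neighbor steps are omitted.

```python
import numpy as np

def feature_pixels(X, grid=8):
    """Assign each feature (column of X) a pixel on a grid x grid canvas.

    Assumption for illustration: features are positioned by their first two
    principal-component coordinates, rescaled to integer pixel indices.
    """
    # Treat every feature as a point described by its values across samples.
    F = X.T - X.T.mean(axis=0)                  # shape (n_features, n_samples)
    U, S, _ = np.linalg.svd(F, full_matrices=False)
    coords = U[:, :2] * S[:2]                   # 2-D PCA coordinates per feature
    lo = coords.min(axis=0)
    span = coords.max(axis=0) - lo
    span[span == 0] = 1.0                       # guard against a flat axis
    return np.floor((coords - lo) / span * (grid - 1)).astype(int)

def vector_to_image(x, pix, grid=8):
    """Render one sample vector x as a pseudo image using the pixel map."""
    total = np.zeros((grid, grid))
    count = np.zeros((grid, grid))
    # np.add.at accumulates correctly when several features share a pixel.
    np.add.at(total, (pix[:, 0], pix[:, 1]), x)
    np.add.at(count, (pix[:, 0], pix[:, 1]), 1)
    # Average colliding features; empty pixels stay at zero.
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)

# Synthetic data standing in for an expression matrix (samples x features).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
pix = feature_pixels(X)
img = vector_to_image(X[0], pix)
print(img.shape)  # (8, 8)
```

Because the pixel map is computed once from the whole matrix, every sample is rendered with the same feature layout, so a given pixel refers to the same feature or group of features in every image. That shared layout is what would let a pixel-level importance score be traced back to features.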
format Online
Article
Text
id pubmed-8921615
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8921615 2022-03-15 Vec2image: an explainable artificial intelligence model for the feature representation and classification of high-dimensional biological data by vector-to-image conversion. Tang, Hui; Yu, Xiangtian; Liu, Rui; Zeng, Tao. Brief Bioinform. Case Study. Oxford University Press 2022-01-31 /pmc/articles/PMC8921615/ /pubmed/35106553 http://dx.doi.org/10.1093/bib/bbab584 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Vec2image: an explainable artificial intelligence model for the feature representation and classification of high-dimensional biological data by vector-to-image conversion
topic Case Study