
Artificial image objects for classification of breast cancer biomarkers with transcriptome sequencing data and convolutional neural network algorithms


Bibliographic Details
Main authors: Chen, Xiangning, Chen, Daniel G., Zhao, Zhongming, Balko, Justin M., Chen, Jingchun
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8504079/
https://www.ncbi.nlm.nih.gov/pubmed/34629099
http://dx.doi.org/10.1186/s13058-021-01474-z
author Chen, Xiangning
Chen, Daniel G.
Zhao, Zhongming
Balko, Justin M.
Chen, Jingchun
collection PubMed
description BACKGROUND: Transcriptome sequencing has become broadly available in clinical studies. However, using these data effectively for clinical applications remains a challenge because of the high dimensionality of the data and the highly correlated expression of individual genes.
METHODS: We proposed a method to transform RNA sequencing data into artificial image objects (AIOs) and applied convolutional neural network (CNN) algorithms to classify these AIOs. With the AIO technique, we treated each gene as a pixel in an image and its expression level as the pixel intensity. Using the GSE96058 (n = 2976), GSE81538 (n = 405), and GSE163882 (n = 222) datasets, we created AIOs for the subjects and designed CNN models to classify the biomarker Ki67 and Nottingham histologic grade (NHG).
RESULTS: With fivefold cross-validation, we achieved a classification accuracy of 0.821 ± 0.023 and an AUC of 0.891 ± 0.021 for Ki67 status. For NHG, the weighted average of categorical accuracy was 0.820 ± 0.012, and the weighted average of AUC was 0.931 ± 0.006. With GSE96058 as training data and GSE81538 as testing data, the accuracy and AUC for Ki67 were 0.826 ± 0.037 and 0.883 ± 0.016, and those for NHG were 0.764 ± 0.052 and 0.882 ± 0.012, respectively. These results were 10% better than the results reported in the original studies. For Ki67, the calls generated by our models predicted survival better than the calls from trained pathologists in survival analyses.
CONCLUSIONS: We demonstrated that RNA sequencing data could be transformed into AIOs and used to classify Ki67 status and NHG with CNN algorithms. The AIO method could handle high-dimensional data with highly correlated variables, and there was no need for variable selection. With the AIO technique, a data-driven, consistent, and automation-ready model could be developed to classify biomarkers with RNA sequencing data and provide more efficient care for cancer patients.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13058-021-01474-z.
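The gene-as-pixel idea in the METHODS summary can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how genes are ordered, how expression is normalized, or what image shape is used, so the function name `expression_to_aio`, the min-max scaling, the row-major gene order, and the zero padding to a square grid are all assumptions made for illustration. The CNN classifier itself is not shown.

```python
import math

import numpy as np


def expression_to_aio(expr):
    """Turn one subject's gene expression vector into a square 2-D image.

    Hypothetical helper: each gene becomes one pixel and its expression
    level becomes the pixel intensity. Scaling, gene order, and padding
    are illustrative assumptions, not details taken from the paper.
    """
    expr = np.asarray(expr, dtype=np.float64)
    n_genes = expr.size
    if n_genes == 0:
        raise ValueError("expression vector must contain at least one gene")

    # Smallest square grid that can hold every gene (ceiling of sqrt).
    side = math.isqrt(n_genes)
    if side * side < n_genes:
        side += 1

    # Min-max scale intensities to [0, 1]; a constant vector maps to zeros.
    lo, hi = expr.min(), expr.max()
    scaled = (expr - lo) / (hi - lo) if hi > lo else np.zeros_like(expr)

    # Pad unused pixels with zeros, then fill the grid row by row.
    padded = np.zeros(side * side)
    padded[:n_genes] = scaled
    return padded.reshape(side, side)


# Toy data only: 10 simulated expression values, not values from any GSE dataset.
rng = np.random.default_rng(0)
toy_expression = rng.lognormal(size=10)
aio = expression_to_aio(toy_expression)
print(aio.shape)  # (4, 4)
```

An array built this way, one per subject, is the kind of input a 2-D convolutional network accepts. Because every gene keeps its own pixel, no variable selection step is needed before training.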
format Online
Article
Text
id pubmed-8504079
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8504079 2021-10-25 Breast Cancer Res Research Article BioMed Central 2021-10-10 2021 /pmc/articles/PMC8504079/ /pubmed/34629099 http://dx.doi.org/10.1186/s13058-021-01474-z Text en © The Author(s) 2021
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Artificial image objects for classification of breast cancer biomarkers with transcriptome sequencing data and convolutional neural network algorithms
topic Research Article