A comprehensive analysis of gene expression profiling data in COVID-19 patients for discovery of specific and differential blood biomarker signatures

COVID-19 is a newly recognized illness with a predominantly respiratory presentation. Although initial analyses have identified groups of candidate gene biomarkers for the diagnosis of COVID-19, they have yet to identify clinically applicable biomarkers, so we need disease-specific diagnostic biomar...

Bibliographic Details
Main authors: Momeni, Maryam, Rashidifar, Maryam, Balam, Farinaz Hosseini, Roointan, Amir, Gholaminejad, Alieh
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10075178/
https://www.ncbi.nlm.nih.gov/pubmed/37019895
http://dx.doi.org/10.1038/s41598-023-32268-2
author Momeni, Maryam
Rashidifar, Maryam
Balam, Farinaz Hosseini
Roointan, Amir
Gholaminejad, Alieh
collection PubMed
description COVID-19 is a newly recognized illness with a predominantly respiratory presentation. Although initial analyses have identified groups of candidate gene biomarkers for the diagnosis of COVID-19, they have yet to identify clinically applicable biomarkers, so disease-specific diagnostic biomarkers in biofluids are needed, along with biomarkers for differential diagnosis against other infectious diseases. Such biomarkers can also increase knowledge of pathogenesis and help guide treatment. Eight transcriptomic profiles of COVID-19 infected versus control samples from peripheral blood (PB), lung tissue, nasopharyngeal swab and bronchoalveolar lavage fluid (BALF) were considered. To find potential COVID-19 Specific Blood Differentially expressed genes (SpeBDs), we implemented a strategy based on finding the pathways shared between peripheral blood and the most involved tissues in COVID-19 patients; this first step filtered the blood differentially expressed genes (DEGs) down to those with a role in the shared pathways. In the second step, nine datasets covering three types of influenza (H1N1, H3N2, and B) were used. Potential Differential Blood DEGs of COVID-19 versus influenza (DifBDs) were found by extracting the DEGs involved only in pathways enriched by SpeBDs and not by influenza DEGs. In the third step, a machine learning method (a wrapper feature selection approach supervised by four classifiers: k-NN, Random Forest, SVM, and Naïve Bayes) was used to narrow down the SpeBDs and DifBDs and find their most predictive combinations, yielding COVID-19 potential Specific Blood Biomarker Signatures (SpeBBSs) and COVID-19 versus influenza Differential Blood Biomarker Signatures (DifBBSs), respectively. Models based on the SpeBBSs and DifBBSs and the corresponding algorithms were then built, and their performance was assessed on an external dataset. Among all the DEGs extracted from the PB dataset (from PB pathways shared with BALF, lung and swab), 108 unique SpeBDs were obtained. Feature selection using Random Forest outperformed its counterparts and selected IGKC, IGLV3-16 and SRP9 among the SpeBDs as SpeBBSs. Validating the model built on these genes with Random Forest on an external dataset gave 93.09% accuracy. Eighty-three pathways enriched by SpeBDs and not by any of the influenza strains were identified, including 87 DifBDs. Using feature selection with the Naïve Bayes classifier on the DifBDs, FMNL2, IGHV3-23, IGLV2-11 and RPL31 were selected as the most predictive DifBBSs. The model built on these genes with Naïve Bayes was validated on an external dataset with 87.2% accuracy. Our study identified several candidate blood biomarkers for a potential specific and differential diagnosis of COVID-19. The proposed biomarkers could be valuable targets for practical investigations to validate their potential.
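The first two filtering steps in the description reduce to set operations over pathways and genes. The following is a minimal sketch of that logic, not the authors' code: the function names, the placeholder pathway and gene labels (P1, G1, ...), and the reading of "shared pathways" as blood pathways that also appear in at least one tissue are all assumptions made for illustration.

```python
from typing import Dict, Set


def genes_in(pathways: Set[str], pathway_genes: Dict[str, Set[str]]) -> Set[str]:
    """Collect every gene annotated to any of the given pathways."""
    genes: Set[str] = set()
    for pathway in pathways:
        genes |= pathway_genes.get(pathway, set())
    return genes


def specific_blood_degs(
    blood_degs: Set[str],
    blood_pathways: Set[str],
    tissue_pathways: Dict[str, Set[str]],
    pathway_genes: Dict[str, Set[str]],
) -> Set[str]:
    """Step 1 (assumed reading): keep blood DEGs that belong to pathways
    enriched in blood and also in at least one other tissue."""
    any_tissue: Set[str] = set()
    for pathways in tissue_pathways.values():
        any_tissue |= pathways
    shared = blood_pathways & any_tissue
    return blood_degs & genes_in(shared, pathway_genes)


def differential_blood_degs(
    spebds: Set[str],
    spebd_pathways: Set[str],
    influenza_pathways: Set[str],
    pathway_genes: Dict[str, Set[str]],
) -> Set[str]:
    """Step 2: keep SpeBDs that sit in pathways enriched by SpeBDs
    but not enriched by influenza DEGs."""
    covid_only = spebd_pathways - influenza_pathways
    return spebds & genes_in(covid_only, pathway_genes)


# Toy, entirely hypothetical inputs (not data from the study).
pathway_genes = {"P1": {"G1", "G2"}, "P2": {"G3"}, "P3": {"G4"}, "P4": {"G5"}}
blood_degs = {"G1", "G2", "G3", "G4"}
blood_pathways = {"P1", "P2", "P3"}
tissue_pathways = {"lung": {"P1", "P4"}, "balf": {"P2"}}
influenza_pathways = {"P2"}

spebds = specific_blood_degs(blood_degs, blood_pathways, tissue_pathways, pathway_genes)
difbds = differential_blood_degs(spebds, {"P1", "P2"}, influenza_pathways, pathway_genes)
print(sorted(spebds))
print(sorted(difbds))
```

In this toy run, G4 is dropped in step 1 because its pathway is enriched only in blood, and G3 is dropped in step 2 because its pathway is also enriched by influenza DEGs. The wrapper feature selection of the third step would then operate on the surviving gene sets.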
format Online
Article
Text
id pubmed-10075178
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10075178 2023-04-06 Sci Rep Article Nature Publishing Group UK 2023-04-05 /pmc/articles/PMC10075178/ /pubmed/37019895 http://dx.doi.org/10.1038/s41598-023-32268-2 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title A comprehensive analysis of gene expression profiling data in COVID-19 patients for discovery of specific and differential blood biomarker signatures
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10075178/
https://www.ncbi.nlm.nih.gov/pubmed/37019895
http://dx.doi.org/10.1038/s41598-023-32268-2