Screening of Gene Expression Markers for Corona Virus Disease 2019 Through Boruta_MCFS Feature Selection

Since SARS-CoV-2 was first reported in Wuhan, China, in December 2019, Corona Virus Disease 2019 (COVID-19) has spread into a global pandemic. Accurate diagnosis of COVID-19 is central to preventing the disease. However, due to the limitation of detectio...

Bibliographic Details
Main Authors: Sun, Yanbao; Zhang, Qi; Yang, Qi; Yao, Ming; Xu, Fang; Chen, Wenyu
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Public Health
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9258782/
https://www.ncbi.nlm.nih.gov/pubmed/35812497
http://dx.doi.org/10.3389/fpubh.2022.901602
_version_ 1784741623771430912
author Sun, Yanbao
Zhang, Qi
Yang, Qi
Yao, Ming
Xu, Fang
Chen, Wenyu
author_facet Sun, Yanbao
Zhang, Qi
Yang, Qi
Yao, Ming
Xu, Fang
Chen, Wenyu
author_sort Sun, Yanbao
collection PubMed
description Since SARS-CoV-2 was first reported in Wuhan, China, in December 2019, Corona Virus Disease 2019 (COVID-19) has spread into a global pandemic. Accurate diagnosis of COVID-19 is central to preventing the disease. However, owing to the limitations of detection technology, test results cannot be entirely free of false positives or false negatives. Improving the precision of test results requires identifying more biomarkers for COVID-19. Using expression data from COVID-19 positive and negative samples, we first screened feature genes with the ReliefF, minimal-redundancy-maximum-relevancy, and Boruta_MCFS methods. We then selected 36 optimal feature genes by incremental feature selection based on a random forest classifier, and identified their enriched biological functions and signaling pathways through Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses. We also performed protein-protein interaction network analysis on these feature genes and analyzed the enriched biological functions and signaling pathways of the main submodules. In addition, dimensionality reduction analysis was used to verify whether the 36 feature genes could effectively distinguish positive samples from negative ones. Based on these results, we inferred that the 36 feature genes selected via Boruta_MCFS could serve as biomarkers for COVID-19.
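
The incremental feature selection step named in the description can be illustrated with a minimal sketch. It assumes a feature ranking is already available (in the study it would come from ReliefF, minimal-redundancy-maximum-relevancy, or Boruta_MCFS) and scores each top-k prefix of that ranking. The study uses a random forest classifier; to stay dependency-free, this sketch swaps in a leave-one-out nearest-centroid classifier as a stand-in. All function names, the toy data, and the ranking are hypothetical and are not taken from the article.

```python
import random
from statistics import mean


def nearest_centroid_loo_accuracy(rows, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Assumption: this is only a lightweight stand-in for the random forest
    classifier named in the abstract.
    """
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        # Build per-class centroids from every sample except the held-out one.
        centroids = {}
        for cls in set(labels):
            members = [
                r for j, (r, l) in enumerate(zip(rows, labels))
                if j != i and l == cls
            ]
            if members:
                centroids[cls] = [mean(col) for col in zip(*members)]
        # Predict the class whose centroid is closest (squared Euclidean distance).
        predicted = min(
            centroids,
            key=lambda cls: sum((a - b) ** 2 for a, b in zip(row, centroids[cls])),
        )
        correct += predicted == label
    return correct / len(rows)


def incremental_feature_selection(rows, labels, ranked_features,
                                  score=nearest_centroid_loo_accuracy):
    """Score the top-1, top-2, ..., top-k ranked features; return the best prefix."""
    best_subset, best_score = [], -1.0
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        projected = [[row[f] for f in subset] for row in rows]
        current = score(projected, labels)
        # Strict '>' keeps the smallest prefix when several prefixes tie.
        if current > best_score:
            best_subset, best_score = subset, current
    return best_subset, best_score


# Synthetic toy data (not from the study): feature 0 separates the two
# classes, features 1-3 are pure noise.
rng = random.Random(0)
labels = [0] * 10 + [1] * 10
rows = [
    [label * 5.0 + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(3)]
    for label in labels
]

# Hypothetical ranking standing in for the output of an upstream ranker.
ranked = [0, 2, 1, 3]
subset, accuracy = incremental_feature_selection(rows, labels, ranked)
print(subset, accuracy)
```

Passing a different `score` callable (for example, cross-validated accuracy of a random forest) would bring the sketch closer to the setup the abstract describes; the prefix-scanning loop itself would stay the same.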
format Online
Article
Text
id pubmed-9258782
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-92587822022-07-07 Screening of Gene Expression Markers for Corona Virus Disease 2019 Through Boruta_MCFS Feature Selection Sun, Yanbao Zhang, Qi Yang, Qi Yao, Ming Xu, Fang Chen, Wenyu Front Public Health Public Health Since SARS-CoV-2 was first reported in Wuhan, China, in December 2019, Corona Virus Disease 2019 (COVID-19) has spread into a global pandemic. Accurate diagnosis of COVID-19 is central to preventing the disease. However, owing to the limitations of detection technology, test results cannot be entirely free of false positives or false negatives. Improving the precision of test results requires identifying more biomarkers for COVID-19. Using expression data from COVID-19 positive and negative samples, we first screened feature genes with the ReliefF, minimal-redundancy-maximum-relevancy, and Boruta_MCFS methods. We then selected 36 optimal feature genes by incremental feature selection based on a random forest classifier, and identified their enriched biological functions and signaling pathways through Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analyses. We also performed protein-protein interaction network analysis on these feature genes and analyzed the enriched biological functions and signaling pathways of the main submodules. In addition, dimensionality reduction analysis was used to verify whether the 36 feature genes could effectively distinguish positive samples from negative ones. Based on these results, we inferred that the 36 feature genes selected via Boruta_MCFS could serve as biomarkers for COVID-19. Frontiers Media S.A. 2022-06-22 /pmc/articles/PMC9258782/ /pubmed/35812497 http://dx.doi.org/10.3389/fpubh.2022.901602 Text en Copyright © 2022 Sun, Zhang, Yang, Yao, Xu and Chen.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Public Health
Sun, Yanbao
Zhang, Qi
Yang, Qi
Yao, Ming
Xu, Fang
Chen, Wenyu
Screening of Gene Expression Markers for Corona Virus Disease 2019 Through Boruta_MCFS Feature Selection
title Screening of Gene Expression Markers for Corona Virus Disease 2019 Through Boruta_MCFS Feature Selection
title_full Screening of Gene Expression Markers for Corona Virus Disease 2019 Through Boruta_MCFS Feature Selection
title_fullStr Screening of Gene Expression Markers for Corona Virus Disease 2019 Through Boruta_MCFS Feature Selection
title_full_unstemmed Screening of Gene Expression Markers for Corona Virus Disease 2019 Through Boruta_MCFS Feature Selection
title_short Screening of Gene Expression Markers for Corona Virus Disease 2019 Through Boruta_MCFS Feature Selection
title_sort screening of gene expression markers for corona virus disease 2019 through boruta_mcfs feature selection
topic Public Health
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9258782/
https://www.ncbi.nlm.nih.gov/pubmed/35812497
http://dx.doi.org/10.3389/fpubh.2022.901602
work_keys_str_mv AT sunyanbao screeningofgeneexpressionmarkersforcoronavirusdisease2019throughborutamcfsfeatureselection
AT zhangqi screeningofgeneexpressionmarkersforcoronavirusdisease2019throughborutamcfsfeatureselection
AT yangqi screeningofgeneexpressionmarkersforcoronavirusdisease2019throughborutamcfsfeatureselection
AT yaoming screeningofgeneexpressionmarkersforcoronavirusdisease2019throughborutamcfsfeatureselection
AT xufang screeningofgeneexpressionmarkersforcoronavirusdisease2019throughborutamcfsfeatureselection
AT chenwenyu screeningofgeneexpressionmarkersforcoronavirusdisease2019throughborutamcfsfeatureselection