Novel Ensemble Feature Selection Approach and Application in Repertoire Sequencing Data
The T and B cell repertoires make up the adaptive immune system and are mainly generated through somatic V(D)J gene recombination. Thus, VJ gene usage may be a potential prognostic or predictive biomarker. However, analysis of the adaptive immune system is challenging due to the heterogeneity of t...
Main Authors: | He, Tao; Baik, Jason Min; Kato, Chiemi; Yang, Hai; Fan, Zenghua; Cham, Jason; Zhang, Li |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9086194/ https://www.ncbi.nlm.nih.gov/pubmed/35559031 http://dx.doi.org/10.3389/fgene.2022.821832 |
_version_ | 1784703944225718272 |
---|---|
author | He, Tao Baik, Jason Min Kato, Chiemi Yang, Hai Fan, Zenghua Cham, Jason Zhang, Li |
author_sort | He, Tao |
collection | PubMed |
description | The T and B cell repertoires make up the adaptive immune system and are mainly generated through somatic V(D)J gene recombination. Thus, VJ gene usage may be a potential prognostic or predictive biomarker. However, analysis of the adaptive immune system is challenging due to the heterogeneity of the clonotypes that make up the repertoire. To address the heterogeneity of the T and B cell repertoires, we proposed a novel ensemble feature selection approach and a customized statistical learning algorithm focusing on VJ gene usage. We applied the proposed approach to T cell receptor sequences from recovered COVID-19 patients and healthy donors, as well as from a group of lung cancer patients who received immunotherapy. Our approach identified distinct VJ genes used in the recovered COVID-19 patients compared with the healthy donors, as well as VJ genes associated with clinical response in the lung cancer patients. Simulation studies show that the ensemble feature selection approach outperformed other state-of-the-art feature selection methods in both efficiency and accuracy. It consistently yielded higher stability and sensitivity with lower false discovery rates. When integrated with different classification methods, the ensemble feature selection approach had the best prediction accuracy. In conclusion, the proposed novel approach and integration procedure provide an effective feature selection technique that aids in correctly classifying different subtypes, helping to clarify the signatures in the adaptive immune response associated with disease or treatment and thereby improve treatment strategies. |
format | Online Article Text |
id | pubmed-9086194 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9086194 2022-05-11 Novel Ensemble Feature Selection Approach and Application in Repertoire Sequencing Data He, Tao Baik, Jason Min Kato, Chiemi Yang, Hai Fan, Zenghua Cham, Jason Zhang, Li Front Genet Genetics The T and B cell repertoires make up the adaptive immune system and are mainly generated through somatic V(D)J gene recombination. Thus, VJ gene usage may be a potential prognostic or predictive biomarker. However, analysis of the adaptive immune system is challenging due to the heterogeneity of the clonotypes that make up the repertoire. To address the heterogeneity of the T and B cell repertoires, we proposed a novel ensemble feature selection approach and a customized statistical learning algorithm focusing on VJ gene usage. We applied the proposed approach to T cell receptor sequences from recovered COVID-19 patients and healthy donors, as well as from a group of lung cancer patients who received immunotherapy. Our approach identified distinct VJ genes used in the recovered COVID-19 patients compared with the healthy donors, as well as VJ genes associated with clinical response in the lung cancer patients. Simulation studies show that the ensemble feature selection approach outperformed other state-of-the-art feature selection methods in both efficiency and accuracy. It consistently yielded higher stability and sensitivity with lower false discovery rates. When integrated with different classification methods, the ensemble feature selection approach had the best prediction accuracy. In conclusion, the proposed novel approach and integration procedure provide an effective feature selection technique that aids in correctly classifying different subtypes, helping to clarify the signatures in the adaptive immune response associated with disease or treatment and thereby improve treatment strategies. Frontiers Media S.A. 2022-04-26 /pmc/articles/PMC9086194/ /pubmed/35559031 http://dx.doi.org/10.3389/fgene.2022.821832 Text en Copyright © 2022 He, Baik, Kato, Yang, Fan, Cham and Zhang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Novel Ensemble Feature Selection Approach and Application in Repertoire Sequencing Data |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9086194/ https://www.ncbi.nlm.nih.gov/pubmed/35559031 http://dx.doi.org/10.3389/fgene.2022.821832 |