Robust classification using average correlations as features (ACF)

Bibliographic Details
Main authors: Schumann, Yannis; Neumann, Julia E.; Neumann, Philipp
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects: Research
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10026437/
https://www.ncbi.nlm.nih.gov/pubmed/36941542
http://dx.doi.org/10.1186/s12859-023-05224-0
Description:
MOTIVATION: In single-cell transcriptomics and other omics technologies, large fractions of missing values commonly occur. Researchers often either consider only those features that were measured for every instance of their dataset, thereby accepting a severe loss of information, or use imputation, which can lead to erroneous results. Pairwise metrics allow for imputation-free classification with minimal loss of data.
RESULTS: With pairwise correlations as the metric, state-of-the-art approaches to classification include the K-nearest-neighbor (KNN) classifier and the distribution-based classification classifier. Our novel method, termed average correlations as features (ACF), significantly outperforms those approaches by training tunable machine learning models on inter-class and intra-class correlations. Our approach is characterized in simulation studies, and its classification performance is demonstrated on real-world datasets from single-cell RNA sequencing and bottom-up proteomics. Furthermore, we demonstrate that variants of our method offer superior flexibility and performance over KNN classifiers and can be used in conjunction with other machine learning methods. In summary, ACF is a flexible method that enables missing-value-tolerant classification with minimal loss of data.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-023-05224-0.
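The idea described in the abstract, using averaged pairwise correlations to each class as input features while tolerating missing values, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pairwise_corr` and `acf_features`, the choice of Pearson correlation, the minimum overlap of three shared measurements, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np

def pairwise_corr(a, b):
    """Pearson correlation over the positions observed (non-NaN) in both vectors."""
    mask = ~np.isnan(a) & ~np.isnan(b)
    if mask.sum() < 3:          # assumed threshold: too little overlap to estimate
        return np.nan
    x, y = a[mask], b[mask]
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return float((xc * yc).sum() / denom) if denom > 0 else np.nan

def acf_features(X, X_ref, y_ref, exclude_self=False):
    """One feature per class: mean correlation of each row of X with the
    reference rows belonging to that class (NaN correlations are ignored)."""
    classes = np.unique(y_ref)
    feats = np.full((X.shape[0], classes.size), np.nan)
    for i, row in enumerate(X):
        corrs = np.array([pairwise_corr(row, ref) for ref in X_ref])
        if exclude_self:
            corrs[i] = np.nan   # assumes X is X_ref: skip the self-correlation
        for j, c in enumerate(classes):
            vals = corrs[y_ref == c]
            if np.any(~np.isnan(vals)):
                feats[i, j] = np.nanmean(vals)
    return classes, feats

# Synthetic example (hypothetical data): two classes with opposite profiles,
# small noise, and roughly 30% of the values missing at random.
rng = np.random.default_rng(0)
profiles = np.vstack([np.linspace(0, 1, 20), np.linspace(1, 0, 20)])
y = np.repeat([0, 1], 6)
X = profiles[y] + rng.normal(0, 0.05, size=(12, 20))
X[rng.random(X.shape) < 0.3] = np.nan

classes, feats = acf_features(X, X, y, exclude_self=True)
```

Each row of `feats` holds one average correlation per class, with no imputation step. In the spirit of the abstract, such a feature matrix could then be passed to any tunable classifier; which model to use is left open here.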
Journal: BMC Bioinformatics
Article type: Research
Publication date: 2023-03-20
License: © The Author(s) 2023. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.