
stepwiseCM: An R Package for Stepwise Classification of Cancer Samples Using Multiple Heterogeneous Data Sets

This paper presents the R/Bioconductor package stepwiseCM, which classifies cancer samples using two heterogeneous data sets in an efficient way. The algorithm is able to capture the distinct classification power of two given data types without actually combining them. This package suits for classif...


Bibliographic Details
Main authors: Obulkasim, Askar; van de Wiel, Mark A
Format: Online Article Text
Language: English
Published: Libertas Academica 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3885337/
https://www.ncbi.nlm.nih.gov/pubmed/24770370
http://dx.doi.org/10.4137/CIN.S13075
author Obulkasim, Askar
van de Wiel, Mark A
author_facet Obulkasim, Askar
van de Wiel, Mark A
author_sort Obulkasim, Askar
collection PubMed
description This paper presents the R/Bioconductor package stepwiseCM, which classifies cancer samples using two heterogeneous data sets in an efficient way. The algorithm is able to capture the distinct classification power of two given data types without actually combining them. The package is suited to classification problems in which two different types of data are available on the same samples. One of these data types is measured on all samples, and the other only on some. The former is easy to collect and/or relatively cheap (eg, clinical covariates) compared with the latter (high-dimensional data, eg, gene expression). An additional application for which stepwiseCM has proven useful is the combination of two high-dimensional data types, eg, DNA copy number and mRNA expression. The package includes functions to project the neighborhood information in one data space onto the other, in order to determine a potential group of samples that are likely to benefit most from measurement of the second type of covariates. The two heterogeneous data spaces are connected by indirect mapping. The crucial difference between the stepwise classification strategy implemented in this package and existing packages is that our approach aims to be cost-efficient by avoiding the measurement of additional covariates, which might be expensive or patient-unfriendly, for a potentially large subgroup of individuals. Moreover, in diagnosis, test results for these individuals would be available quickly, which may reduce waiting times and hence patients’ distress. The improvement described remedies the key limitations of existing packages and facilitates the use of the stepwiseCM package in diverse applications.
format Online
Article
Text
id pubmed-3885337
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-3885337 2014-01-16 stepwiseCM: An R Package for Stepwise Classification of Cancer Samples Using Multiple Heterogeneous Data Sets Obulkasim, Askar van de Wiel, Mark A Cancer Inform Software Review
Libertas Academica 2014-01-02 /pmc/articles/PMC3885337/ /pubmed/24770370 http://dx.doi.org/10.4137/CIN.S13075 Text en © 2014 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
spellingShingle Software Review
Obulkasim, Askar
van de Wiel, Mark A
stepwiseCM: An R Package for Stepwise Classification of Cancer Samples Using Multiple Heterogeneous Data Sets
title stepwiseCM: An R Package for Stepwise Classification of Cancer Samples Using Multiple Heterogeneous Data Sets
topic Software Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3885337/
https://www.ncbi.nlm.nih.gov/pubmed/24770370
http://dx.doi.org/10.4137/CIN.S13075
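The stepwise idea summarized in the description (classify everyone with the cheap data type, then use neighborhood information in that data space to pick out the samples most likely to benefit from the second data type) can be illustrated with a small sketch. This is not the stepwiseCM API and not the package's actual scoring rule: the function names, the Euclidean k-nearest-neighbor rule, the accuracy-difference score, and the toy data are all assumptions made for illustration, and Python is used only for convenience even though the package itself is written in R.

```python
import math

# Hypothetical toy training set: every sample has both data types, so we can
# record whether each stage's classifier got it right (e.g. from cross-validation).
train_clinical = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                  (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
stage1_correct = [1, 1, 1, 0, 0, 1]  # cheap data type (e.g. clinical covariates)
stage2_correct = [1, 1, 0, 1, 1, 1]  # expensive data type (e.g. gene expression)


def nearest_neighbors(x, train_points, k):
    """Indices of the k training points closest to x (Euclidean distance)."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(x, train_points[i]))
    return order[:k]


def reclassification_gain(x, train_points, s1_correct, s2_correct, k=3):
    """Stage-2 accuracy minus stage-1 accuracy among the neighbors of x.

    The neighborhood is found in the cheap data space only, so the new
    sample does not need the expensive measurement to be scored.
    """
    idx = nearest_neighbors(x, train_points, k)
    acc1 = sum(s1_correct[i] for i in idx) / len(idx)
    acc2 = sum(s2_correct[i] for i in idx) / len(idx)
    return acc2 - acc1


def needs_second_data_type(x, train_points, s1_correct, s2_correct,
                           k=3, threshold=0.0):
    """Recommend the expensive measurement only if the local gain is positive."""
    gain = reclassification_gain(x, train_points, s1_correct, s2_correct, k)
    return gain > threshold


if __name__ == "__main__":
    for sample in [(0.1, 0.1), (5.0, 5.1)]:
        decision = needs_second_data_type(sample, train_clinical,
                                          stage1_correct, stage2_correct)
        print(sample, "measure second data type:", decision)
```

In this toy setting, a new sample near the first cluster keeps its stage-1 prediction, because the cheap classifier already does well among its neighbors, while a sample near the second cluster is flagged for the expensive measurement. Raising the hypothetical `threshold` parameter would send fewer samples on to the second stage, which is the cost trade-off the description refers to.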