Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data
BACKGROUND: To address high-dimensional genomic data, most of the proposed prediction methods make use of genomic data alone without considering clinical data, which are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may imp...
Main Authors: | Bazzoli, Caroline, Lambert-Lacroix, Sophie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6127926/ https://www.ncbi.nlm.nih.gov/pubmed/30189832 http://dx.doi.org/10.1186/s12859-018-2311-2 |
_version_ | 1783353558428549120 |
---|---|
author | Bazzoli, Caroline Lambert-Lacroix, Sophie |
author_facet | Bazzoli, Caroline Lambert-Lacroix, Sophie |
author_sort | Bazzoli, Caroline |
collection | PubMed |
description | BACKGROUND: To address high-dimensional genomic data, most of the proposed prediction methods make use of genomic data alone without considering clinical data, which are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may improve predictions. We consider here methods for classification purposes that simultaneously use both types of variables but apply dimensionality reduction only to the high-dimensional genomic ones. RESULTS: Using partial least squares (PLS), we propose some one-step approaches based on three extensions of the least squares (LS)-PLS method for logistic regression. A comparison of their prediction performances via a simulation and on real data sets from cancer studies is conducted. CONCLUSION: In general, those methods using only clinical data or only genomic data perform poorly. The advantage of using LS-PLS methods for classification and their performances are shown and then used to analyze clinical and genomic data. The corresponding prediction results are encouraging and stable regardless of the data set and/or number of selected features. These extensions have been implemented in the R package lsplsGlm to enhance their use. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2311-2) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6127926 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-61279262018-09-10 Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data Bazzoli, Caroline Lambert-Lacroix, Sophie BMC Bioinformatics Methodology Article BACKGROUND: To address high-dimensional genomic data, most of the proposed prediction methods make use of genomic data alone without considering clinical data, which are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may improve predictions. We consider here methods for classification purposes that simultaneously use both types of variables but apply dimensionality reduction only to the high-dimensional genomic ones. RESULTS: Using partial least squares (PLS), we propose some one-step approaches based on three extensions of the least squares (LS)-PLS method for logistic regression. A comparison of their prediction performances via a simulation and on real data sets from cancer studies is conducted. CONCLUSION: In general, those methods using only clinical data or only genomic data perform poorly. The advantage of using LS-PLS methods for classification and their performances are shown and then used to analyze clinical and genomic data. The corresponding prediction results are encouraging and stable regardless of the data set and/or number of selected features. These extensions have been implemented in the R package lsplsGlm to enhance their use. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2311-2) contains supplementary material, which is available to authorized users. 
BioMed Central 2018-09-06 /pmc/articles/PMC6127926/ /pubmed/30189832 http://dx.doi.org/10.1186/s12859-018-2311-2 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Bazzoli, Caroline Lambert-Lacroix, Sophie Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data |
title | Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data |
title_full | Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data |
title_fullStr | Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data |
title_full_unstemmed | Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data |
title_short | Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data |
title_sort | classification based on extensions of ls-pls using logistic regression: application to clinical and multiple genomic data |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6127926/ https://www.ncbi.nlm.nih.gov/pubmed/30189832 http://dx.doi.org/10.1186/s12859-018-2311-2 |
work_keys_str_mv | AT bazzolicaroline classificationbasedonextensionsoflsplsusinglogisticregressionapplicationtoclinicalandmultiplegenomicdata AT lambertlacroixsophie classificationbasedonextensionsoflsplsusinglogisticregressionapplicationtoclinicalandmultiplegenomicdata |