
Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data

BACKGROUND: To address high-dimensional genomic data, most of the proposed prediction methods make use of genomic data alone without considering clinical data, which are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may imp...


Bibliographic Details
Main Authors: Bazzoli, Caroline; Lambert-Lacroix, Sophie
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6127926/
https://www.ncbi.nlm.nih.gov/pubmed/30189832
http://dx.doi.org/10.1186/s12859-018-2311-2
_version_ 1783353558428549120
author Bazzoli, Caroline
Lambert-Lacroix, Sophie
collection PubMed
description BACKGROUND: To address high-dimensional genomic data, most of the proposed prediction methods make use of genomic data alone without considering clinical data, which are often available and known to have predictive value. Recent studies suggest that combining clinical and genomic information may improve predictions. We consider here methods for classification purposes that simultaneously use both types of variables but apply dimensionality reduction only to the high-dimensional genomic ones. RESULTS: Using partial least squares (PLS), we propose some one-step approaches based on three extensions of the least squares (LS)-PLS method for logistic regression. A comparison of their prediction performances via a simulation and on real data sets from cancer studies is conducted. CONCLUSION: In general, those methods using only clinical data or only genomic data perform poorly. The advantage of using LS-PLS methods for classification and their performances are shown and then used to analyze clinical and genomic data. The corresponding prediction results are encouraging and stable regardless of the data set and/or number of selected features. These extensions have been implemented in the R package lsplsGlm to enhance their use. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2311-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6127926
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6127926 2018-09-10 Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data Bazzoli, Caroline Lambert-Lacroix, Sophie BMC Bioinformatics Methodology Article
BioMed Central 2018-09-06 /pmc/articles/PMC6127926/ /pubmed/30189832 http://dx.doi.org/10.1186/s12859-018-2311-2 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6127926/
https://www.ncbi.nlm.nih.gov/pubmed/30189832
http://dx.doi.org/10.1186/s12859-018-2311-2