Partial Least Squares with Structured Output for Modelling the Metabolomics Data Obtained from Complex Experimental Designs: A Study into the Y-Block Coding
Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate m...
Main authors: | Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2016 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5192444/ https://www.ncbi.nlm.nih.gov/pubmed/27801817 http://dx.doi.org/10.3390/metabo6040038 |
_version_ | 1782487777852522496 |
---|---|
author | Xu, Yun Muhamadali, Howbeer Sayqal, Ali Dixon, Neil Goodacre, Royston |
author_facet | Xu, Yun Muhamadali, Howbeer Sayqal, Ali Dixon, Neil Goodacre, Royston |
author_sort | Xu, Yun |
collection | PubMed |
description | Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a “pure” regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding. |
format | Online Article Text |
id | pubmed-5192444 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-51924442017-01-03 Partial Least Squares with Structured Output for Modelling the Metabolomics Data Obtained from Complex Experimental Designs: A Study into the Y-Block Coding Xu, Yun Muhamadali, Howbeer Sayqal, Ali Dixon, Neil Goodacre, Royston Metabolites Article Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a “pure” regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding. MDPI 2016-10-28 /pmc/articles/PMC5192444/ /pubmed/27801817 http://dx.doi.org/10.3390/metabo6040038 Text en © 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Xu, Yun Muhamadali, Howbeer Sayqal, Ali Dixon, Neil Goodacre, Royston Partial Least Squares with Structured Output for Modelling the Metabolomics Data Obtained from Complex Experimental Designs: A Study into the Y-Block Coding |
title | Partial Least Squares with Structured Output for Modelling the Metabolomics Data Obtained from Complex Experimental Designs: A Study into the Y-Block Coding |
title_sort | partial least squares with structured output for modelling the metabolomics data obtained from complex experimental designs: a study into the y-block coding |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5192444/ https://www.ncbi.nlm.nih.gov/pubmed/27801817 http://dx.doi.org/10.3390/metabo6040038 |