
Statistical HOmogeneous Cluster SpectroscopY (SHOCSY): An Optimized Statistical Approach for Clustering of (1)H NMR Spectral Data to Reduce Interference and Enhance Robust Biomarkers Selection

We propose a novel statistical approach to improve the reliability of (1)H NMR spectral analysis in complex metabolic studies. The Statistical HOmogeneous Cluster SpectroscopY (SHOCSY) algorithm aims to reduce the variation within biological classes by selecting subsets of homogene...


Bibliographic Details
Main authors: Zou, Xin; Holmes, Elaine; Nicholson, Jeremy K.; Loo, Ruey Leng
Format: Online Article Text
Language: English
Published: American Chemical Society 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4110102/
https://www.ncbi.nlm.nih.gov/pubmed/24773160
http://dx.doi.org/10.1021/ac500161k
author Zou, Xin
Holmes, Elaine
Nicholson, Jeremy K.
Loo, Ruey Leng
collection PubMed
description We propose a novel statistical approach to improve the reliability of (1)H NMR spectral analysis in complex metabolic studies. The Statistical HOmogeneous Cluster SpectroscopY (SHOCSY) algorithm aims to reduce the variation within biological classes by selecting subsets of homogeneous (1)H NMR spectra that contain specific spectroscopic metabolic signatures related to each biological class in a study. In SHOCSY, we used a clustering method to partition the whole data set into clusters of samples, each cluster showing similar spectral features and hence similar biochemical composition, and we then used an enrichment test to identify the associations between the clusters and the biological classes in the data set. We evaluated the performance of the SHOCSY algorithm using a simulated (1)H NMR data set that emulates renal tubule toxicity and further exemplified the method with a (1)H NMR spectroscopic study of hydrazine-induced liver toxicity in rats. The SHOCSY algorithm improved the predictive ability of the orthogonal partial least-squares discriminatory analysis (OPLS-DA) model through the use of “truly” representative samples in each biological class (i.e., homogeneous subsets). This method ensures that the analyses are no longer confounded by idiosyncratic responders and thus improves the reliability of biomarker extraction. SHOCSY is a useful tool for removing irrelevant variation that interferes with the interpretation and predictive ability of models, and it has widespread applicability to other spectroscopic data as well as to other “omics” types of data.
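The two steps named in the description (cluster the spectra, then test each cluster for enrichment in a biological class) can be illustrated with a minimal sketch. The abstract does not say which clustering method or which enrichment test the authors used, so everything specific here is an assumption made for illustration: k-means with a deterministic farthest-point start, a one-sided hypergeometric test, a 0.05 threshold, the toy two-bin "spectra", and the function names `kmeans`, `enrichment_p`, and `homogeneous_subsets`. It should not be read as the published SHOCSY implementation.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two spectra (lists of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(spectra, k, n_iter=50):
    """Plain k-means (assumed method). Returns one cluster label per spectrum."""
    # Deterministic farthest-point initialisation: start from the first
    # spectrum, then repeatedly add the spectrum farthest from all centers.
    centers = [list(spectra[0])]
    while len(centers) < k:
        far = max(spectra, key=lambda s: min(sq_dist(s, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(s, centers[c]))
                      for s in spectra]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [s for s, l in zip(spectra, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def enrichment_p(n_total, n_class, n_cluster, n_overlap):
    """One-sided hypergeometric p-value (assumed test): the probability of
    seeing at least n_overlap class members in a cluster of size n_cluster."""
    upper = min(n_class, n_cluster)
    hits = sum(math.comb(n_class, x) * math.comb(n_total - n_class, n_cluster - x)
               for x in range(n_overlap, upper + 1))
    return hits / math.comb(n_total, n_cluster)


def homogeneous_subsets(spectra, classes, k, alpha=0.05):
    """For each class, keep the samples that fall in a cluster enriched for it."""
    labels = kmeans(spectra, k)
    selected = {}
    for c in sorted(set(labels)):
        in_cluster = [i for i, l in enumerate(labels) if l == c]
        for cls in sorted(set(classes)):
            overlap = [i for i in in_cluster if classes[i] == cls]
            p = enrichment_p(len(spectra), classes.count(cls),
                             len(in_cluster), len(overlap))
            if p < alpha:
                selected.setdefault(cls, []).extend(overlap)
    return selected


# Toy data (invented): six control-like spectra, five dosed-like spectra,
# and one dosed sample (index 11) whose spectrum looks like a control.
spectra = [[1.0, 0.0], [0.9, 0.1], [1.1, 0.0], [1.0, 0.1], [0.9, 0.0], [1.0, 0.05],
           [0.0, 1.0], [0.1, 0.9], [0.0, 1.1], [0.1, 1.0], [0.05, 0.95], [0.95, 0.05]]
classes = ["control"] * 6 + ["dosed"] * 6
result = homogeneous_subsets(spectra, classes, k=2)
print(result)
```

In this toy case the dosed sample that clusters with the controls is left out of the "dosed" subset, which is the kind of idiosyncratic responder the description says should not confound the downstream model. A real analysis would work on full spectra, choose the number of clusters with care, and correct the p-values for multiple testing.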
format Online
Article
Text
id pubmed-4110102
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-4110102 2014-07-25 Anal Chem
American Chemical Society 2014-04-28 2014-06-03 /pmc/articles/PMC4110102/ /pubmed/24773160 http://dx.doi.org/10.1021/ac500161k Text en Copyright © 2014 American Chemical Society Terms of Use CC-BY (http://pubs.acs.org/page/policy/authorchoice_ccby_termsofuse.html)
title Statistical HOmogeneous Cluster SpectroscopY (SHOCSY): An Optimized Statistical Approach for Clustering of (1)H NMR Spectral Data to Reduce Interference and Enhance Robust Biomarkers Selection