A hierarchical approach to removal of unwanted variation for large-scale metabolomics data
Liquid chromatography-mass spectrometry-based metabolomics studies are increasingly applied to large population cohorts, which run for several weeks or even years in data acquisition. This inevitably introduces unwanted intra- and inter-batch variations over time that can overshadow true biological...
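Main Authors: | Kim, Taiyun; Tang, Owen; Vernon, Stephen T.; Kott, Katharine A.; Koay, Yen Chin; Park, John; James, David E.; Grieve, Stuart M.; Speed, Terence P.; Yang, Pengyi; Figtree, Gemma A.; O’Sullivan, John F.; Yang, Jean Yee Hwa |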
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8371158/ https://www.ncbi.nlm.nih.gov/pubmed/34404777 http://dx.doi.org/10.1038/s41467-021-25210-5 |
_version_ | 1783739581980475392 |
---|---|
author | Kim, Taiyun Tang, Owen Vernon, Stephen T. Kott, Katharine A. Koay, Yen Chin Park, John James, David E. Grieve, Stuart M. Speed, Terence P. Yang, Pengyi Figtree, Gemma A. O’Sullivan, John F. Yang, Jean Yee Hwa |
author_facet | Kim, Taiyun Tang, Owen Vernon, Stephen T. Kott, Katharine A. Koay, Yen Chin Park, John James, David E. Grieve, Stuart M. Speed, Terence P. Yang, Pengyi Figtree, Gemma A. O’Sullivan, John F. Yang, Jean Yee Hwa |
author_sort | Kim, Taiyun |
collection | PubMed |
description | Liquid chromatography-mass spectrometry-based metabolomics studies are increasingly applied to large population cohorts, which run for several weeks or even years in data acquisition. This inevitably introduces unwanted intra- and inter-batch variations over time that can overshadow true biological signals and thus hinder potential biological discoveries. To date, normalisation approaches have struggled to mitigate the variability introduced by technical factors whilst preserving biological variance, especially for protracted acquisitions. Here, we propose a study design framework with an arrangement for embedding biological sample replicates to quantify variance within and between batches and a workflow that uses these replicates to remove unwanted variation in a hierarchical manner (hRUV). We use this design to produce a dataset of more than 1000 human plasma samples run over an extended period of time. We demonstrate significant improvement of hRUV over existing methods in preserving biological signals whilst removing unwanted variation for large scale metabolomics studies. Our tools not only provide a strategy for large scale data normalisation, but also provide guidance on the design strategy for large omics studies. |
format | Online Article Text |
id | pubmed-8371158 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-83711582021-09-02 A hierarchical approach to removal of unwanted variation for large-scale metabolomics data Kim, Taiyun Tang, Owen Vernon, Stephen T. Kott, Katharine A. Koay, Yen Chin Park, John James, David E. Grieve, Stuart M. Speed, Terence P. Yang, Pengyi Figtree, Gemma A. O’Sullivan, John F. Yang, Jean Yee Hwa Nat Commun Article Liquid chromatography-mass spectrometry-based metabolomics studies are increasingly applied to large population cohorts, which run for several weeks or even years in data acquisition. This inevitably introduces unwanted intra- and inter-batch variations over time that can overshadow true biological signals and thus hinder potential biological discoveries. To date, normalisation approaches have struggled to mitigate the variability introduced by technical factors whilst preserving biological variance, especially for protracted acquisitions. Here, we propose a study design framework with an arrangement for embedding biological sample replicates to quantify variance within and between batches and a workflow that uses these replicates to remove unwanted variation in a hierarchical manner (hRUV). We use this design to produce a dataset of more than 1000 human plasma samples run over an extended period of time. We demonstrate significant improvement of hRUV over existing methods in preserving biological signals whilst removing unwanted variation for large scale metabolomics studies. Our tools not only provide a strategy for large scale data normalisation, but also provides guidance on the design strategy for large omics studies. 
Nature Publishing Group UK 2021-08-17 /pmc/articles/PMC8371158/ /pubmed/34404777 http://dx.doi.org/10.1038/s41467-021-25210-5 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Article Kim, Taiyun Tang, Owen Vernon, Stephen T. Kott, Katharine A. Koay, Yen Chin Park, John James, David E. Grieve, Stuart M. Speed, Terence P. Yang, Pengyi Figtree, Gemma A. O’Sullivan, John F. Yang, Jean Yee Hwa A hierarchical approach to removal of unwanted variation for large-scale metabolomics data |
title | A hierarchical approach to removal of unwanted variation for large-scale metabolomics data |
title_full | A hierarchical approach to removal of unwanted variation for large-scale metabolomics data |
title_fullStr | A hierarchical approach to removal of unwanted variation for large-scale metabolomics data |
title_full_unstemmed | A hierarchical approach to removal of unwanted variation for large-scale metabolomics data |
title_short | A hierarchical approach to removal of unwanted variation for large-scale metabolomics data |
title_sort | hierarchical approach to removal of unwanted variation for large-scale metabolomics data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8371158/ https://www.ncbi.nlm.nih.gov/pubmed/34404777 http://dx.doi.org/10.1038/s41467-021-25210-5 |