A New Pipeline for the Normalization and Pooling of Metabolomics Data
Pooling metabolomics data across studies is often desirable to increase the statistical power of the analysis. However, this can raise methodological challenges as several preanalytical and analytical factors could introduce differences in measured concentrations and variability between datasets. Sp...
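Main authors: | Viallon, Vivian; His, Mathilde; Rinaldi, Sabina; Breeur, Marie; Gicquiau, Audrey; Hemon, Bertrand; Overvad, Kim; Tjønneland, Anne; Rostgaard-Hansen, Agnetha Linn; Rothwell, Joseph A.; Lecuyer, Lucie; Severi, Gianluca; Kaaks, Rudolf; Johnson, Theron; Schulze, Matthias B.; Palli, Domenico; Agnoli, Claudia; Panico, Salvatore; Tumino, Rosario; Ricceri, Fulvio; Verschuren, W. M. Monique; Engelfriet, Peter; Onland-Moret, Charlotte; Vermeulen, Roel; Nøst, Therese Haugdahl; Urbarova, Ilona; Zamora-Ros, Raul; Rodriguez-Barranco, Miguel; Amiano, Pilar; Huerta, José Maria; Ardanaz, Eva; Melander, Olle; Ottoson, Filip; Vidman, Linda; Rentoft, Matilda; Schmidt, Julie A.; Travis, Ruth C.; Weiderpass, Elisabete; Johansson, Mattias; Dossus, Laure; Jenab, Mazda; Gunter, Marc J.; Lorenzo Bermejo, Justo; Scherer, Dominique; Salek, Reza M.; Keski-Rahkonen, Pekka; Ferrari, Pietro |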
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8467830/ https://www.ncbi.nlm.nih.gov/pubmed/34564446 http://dx.doi.org/10.3390/metabo11090631 |
_version_ | 1784573501409067008 |
---|---|
author | Viallon, Vivian; His, Mathilde; Rinaldi, Sabina; Breeur, Marie; Gicquiau, Audrey; Hemon, Bertrand; Overvad, Kim; Tjønneland, Anne; Rostgaard-Hansen, Agnetha Linn; Rothwell, Joseph A.; Lecuyer, Lucie; Severi, Gianluca; Kaaks, Rudolf; Johnson, Theron; Schulze, Matthias B.; Palli, Domenico; Agnoli, Claudia; Panico, Salvatore; Tumino, Rosario; Ricceri, Fulvio; Verschuren, W. M. Monique; Engelfriet, Peter; Onland-Moret, Charlotte; Vermeulen, Roel; Nøst, Therese Haugdahl; Urbarova, Ilona; Zamora-Ros, Raul; Rodriguez-Barranco, Miguel; Amiano, Pilar; Huerta, José Maria; Ardanaz, Eva; Melander, Olle; Ottoson, Filip; Vidman, Linda; Rentoft, Matilda; Schmidt, Julie A.; Travis, Ruth C.; Weiderpass, Elisabete; Johansson, Mattias; Dossus, Laure; Jenab, Mazda; Gunter, Marc J.; Lorenzo Bermejo, Justo; Scherer, Dominique; Salek, Reza M.; Keski-Rahkonen, Pekka; Ferrari, Pietro |
author_sort | Viallon, Vivian |
collection | PubMed |
description | Pooling metabolomics data across studies is often desirable to increase the statistical power of the analysis. However, pooling raises methodological challenges, because several preanalytical and analytical factors can introduce differences in measured concentrations and variability between datasets. Specifically, different studies may use different sample types (e.g., serum versus plasma) that are collected, treated, and stored according to different protocols and assayed in different laboratories using different instruments. To address these issues, a new pipeline was developed to normalize and pool metabolomics data through a set of sequential steps: (i) exclusion of the least informative observations and metabolites, removal of outliers, and imputation of missing data; (ii) identification of the main sources of variability through principal component partial R-square (PC-PR2) analysis; (iii) application of linear mixed models to remove unwanted variability, including that due to the samples' originating study and batch, while preserving biological variation and accounting for potential differences in the residual variances across studies. This pipeline was applied to targeted metabolomics data acquired using Biocrates AbsoluteIDQ kits in eight case-control studies nested within the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort. Comprehensive examination of the metabolomics measurements indicated that the pipeline improved the comparability of data across the studies. Our pipeline can be adapted to normalize other molecular data, including biomarkers as well as proteomics data, and could be used for pooling molecular datasets, for example in international consortia, to limit biases introduced by inter-study variability. This versatility makes our work of potential interest to molecular epidemiologists. |
format | Online Article Text |
id | pubmed-8467830 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8467830 2021-09-27 A New Pipeline for the Normalization and Pooling of Metabolomics Data. Metabolites. Article. MDPI 2021-09-17 /pmc/articles/PMC8467830/ /pubmed/34564446 http://dx.doi.org/10.3390/metabo11090631 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | A New Pipeline for the Normalization and Pooling of Metabolomics Data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8467830/ https://www.ncbi.nlm.nih.gov/pubmed/34564446 http://dx.doi.org/10.3390/metabo11090631 |
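
Step (ii) of the pipeline described above, PC-PR2 analysis, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes numeric covariates with one column each (categorical factors such as study or batch would need dummy coding), an 80% explained-variance threshold, and one common definition of partial R-square, (SSE_reduced − SSE_full) / SSE_reduced. The function name `pc_pr2`, its parameters, and the simulated data are all hypothetical.

```python
import numpy as np


def pc_pr2(X, Z, names, var_threshold=0.8):
    """Eigenvalue-weighted partial R^2 of each covariate over the leading PCs of X.

    X: (n, p) metabolite matrix; Z: (n, q) numeric covariates; names: q labels.
    """
    # Autoscale the metabolites, then obtain the principal components by SVD.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    eig = s ** 2
    explained = np.cumsum(eig) / eig.sum()
    # Keep the smallest number of components reaching the variance threshold.
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(eig))
    scores = U[:, :k] * s[:k]

    n, q = Z.shape
    intercept = np.ones((n, 1))

    def sse(design, y):
        # Residual sum of squares of an ordinary least-squares fit.
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        return float(resid @ resid)

    full = np.hstack([intercept, Z])
    partial = np.zeros((k, q))
    for c in range(k):
        y = scores[:, c]
        sse_full = sse(full, y)
        for j in range(q):
            # Reduced model: all covariates except the j-th.
            reduced = np.hstack([intercept, np.delete(Z, j, axis=1)])
            sse_red = sse(reduced, y)
            partial[c, j] = (sse_red - sse_full) / sse_red

    # Weight each component's partial R^2 by its share of retained variance.
    weights = eig[:k] / eig[:k].sum()
    return dict(zip(names, weights @ partial))


# Simulated example: a "study" indicator shifts every metabolite, "age" does not.
rng = np.random.default_rng(0)
n, p = 200, 20
study = rng.integers(0, 2, size=n).astype(float)
age = rng.normal(50, 10, size=n)
X = rng.normal(size=(n, p)) + 2.0 * study[:, None]
result = pc_pr2(X, np.column_stack([study, age]), ["study", "age"])
print(result)
```

In this simulated setting the study indicator should account for far more of the variability than age, which is the kind of signal that would motivate removing study effects in step (iii).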