
A New Pipeline for the Normalization and Pooling of Metabolomics Data

Pooling metabolomics data across studies is often desirable to increase the statistical power of the analysis. However, this can raise methodological challenges as several preanalytical and analytical factors could introduce differences in measured concentrations and variability between datasets. Sp...

Bibliographic Details
Main authors: Viallon, Vivian; His, Mathilde; Rinaldi, Sabina; Breeur, Marie; Gicquiau, Audrey; Hemon, Bertrand; Overvad, Kim; Tjønneland, Anne; Rostgaard-Hansen, Agnetha Linn; Rothwell, Joseph A.; Lecuyer, Lucie; Severi, Gianluca; Kaaks, Rudolf; Johnson, Theron; Schulze, Matthias B.; Palli, Domenico; Agnoli, Claudia; Panico, Salvatore; Tumino, Rosario; Ricceri, Fulvio; Verschuren, W. M. Monique; Engelfriet, Peter; Onland-Moret, Charlotte; Vermeulen, Roel; Nøst, Therese Haugdahl; Urbarova, Ilona; Zamora-Ros, Raul; Rodriguez-Barranco, Miguel; Amiano, Pilar; Huerta, José Maria; Ardanaz, Eva; Melander, Olle; Ottoson, Filip; Vidman, Linda; Rentoft, Matilda; Schmidt, Julie A.; Travis, Ruth C.; Weiderpass, Elisabete; Johansson, Mattias; Dossus, Laure; Jenab, Mazda; Gunter, Marc J.; Lorenzo Bermejo, Justo; Scherer, Dominique; Salek, Reza M.; Keski-Rahkonen, Pekka; Ferrari, Pietro
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8467830/
https://www.ncbi.nlm.nih.gov/pubmed/34564446
http://dx.doi.org/10.3390/metabo11090631
author Viallon, Vivian
His, Mathilde
Rinaldi, Sabina
Breeur, Marie
Gicquiau, Audrey
Hemon, Bertrand
Overvad, Kim
Tjønneland, Anne
Rostgaard-Hansen, Agnetha Linn
Rothwell, Joseph A.
Lecuyer, Lucie
Severi, Gianluca
Kaaks, Rudolf
Johnson, Theron
Schulze, Matthias B.
Palli, Domenico
Agnoli, Claudia
Panico, Salvatore
Tumino, Rosario
Ricceri, Fulvio
Verschuren, W. M. Monique
Engelfriet, Peter
Onland-Moret, Charlotte
Vermeulen, Roel
Nøst, Therese Haugdahl
Urbarova, Ilona
Zamora-Ros, Raul
Rodriguez-Barranco, Miguel
Amiano, Pilar
Huerta, José Maria
Ardanaz, Eva
Melander, Olle
Ottoson, Filip
Vidman, Linda
Rentoft, Matilda
Schmidt, Julie A.
Travis, Ruth C.
Weiderpass, Elisabete
Johansson, Mattias
Dossus, Laure
Jenab, Mazda
Gunter, Marc J.
Lorenzo Bermejo, Justo
Scherer, Dominique
Salek, Reza M.
Keski-Rahkonen, Pekka
Ferrari, Pietro
author_sort Viallon, Vivian
collection PubMed
description Pooling metabolomics data across studies is often desirable to increase the statistical power of the analysis. However, this can raise methodological challenges as several preanalytical and analytical factors could introduce differences in measured concentrations and variability between datasets. Specifically, different studies may use variable sample types (e.g., serum versus plasma) collected, treated, and stored according to different protocols, and assayed in different laboratories using different instruments. To address these issues, a new pipeline was developed to normalize and pool metabolomics data through a set of sequential steps: (i) exclusions of the least informative observations and metabolites and removal of outliers; imputation of missing data; (ii) identification of the main sources of variability through principal component partial R-square (PC-PR2) analysis; (iii) application of linear mixed models to remove unwanted variability, including samples’ originating study and batch, and preserve biological variations while accounting for potential differences in the residual variances across studies. This pipeline was applied to targeted metabolomics data acquired using Biocrates AbsoluteIDQ kits in eight case-control studies nested within the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort. Comprehensive examination of metabolomics measurements indicated that the pipeline improved the comparability of data across the studies. Our pipeline can be adapted to normalize other molecular data, including biomarkers as well as proteomics data, and could be used for pooling molecular datasets, for example in international consortia, to limit biases introduced by inter-study variability. This versatility of the pipeline makes our work of potential interest to molecular epidemiologists.
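As an illustration of step (iii) only, the following is a minimal sketch and not the authors' implementation. It assumes a single metabolite measured on a log scale, one random intercept for study, and a common residual variance, whereas the pipeline described above also handles batch, preserves biological covariates, and allows residual variances to differ across studies. The function name `remove_study_effect` and the example data are hypothetical.

```python
from statistics import mean


def remove_study_effect(values, studies):
    """Subtract a shrunken (BLUP-style) study effect from each measurement.

    values  : list of floats, one metabolite level per sample (assumed log scale)
    studies : list of study labels, same length as values
    Returns (adjusted_values, estimated_study_effects).
    """
    groups = {}
    for y, s in zip(values, studies):
        groups.setdefault(s, []).append(y)
    k = len(groups)
    n_total = len(values)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two studies and replicated samples")

    grand = mean(values)
    study_mean = {s: mean(ys) for s, ys in groups.items()}

    # One-way ANOVA sums of squares (within and between studies).
    ss_within = sum((y - study_mean[s]) ** 2 for s, ys in groups.items() for y in ys)
    ss_between = sum(len(ys) * (study_mean[s] - grand) ** 2 for s, ys in groups.items())
    ms_within = ss_within / (n_total - k)
    ms_between = ss_between / (k - 1)

    # Method-of-moments variance components, valid for unbalanced designs.
    n0 = (n_total - sum(len(ys) ** 2 for ys in groups.values()) / n_total) / (k - 1)
    var_study = max(0.0, (ms_between - ms_within) / n0)
    var_resid = ms_within

    # No detectable between-study variance: leave the data unchanged.
    if var_study == 0.0:
        return list(values), {s: 0.0 for s in groups}

    # Generalized least squares estimate of the overall mean.
    weights = {s: 1.0 / (var_study + var_resid / len(ys)) for s, ys in groups.items()}
    mu = sum(weights[s] * study_mean[s] for s in groups) / sum(weights.values())

    # Predicted random intercepts: study deviations shrunk toward zero.
    effect = {}
    for s, ys in groups.items():
        shrink = var_study / (var_study + var_resid / len(ys))
        effect[s] = shrink * (study_mean[s] - mu)

    adjusted = [y - effect[s] for y, s in zip(values, studies)]
    return adjusted, effect


# Hypothetical data: two studies whose measurements differ by an offset.
levels = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.8, 3.1]
labels = ["A"] * 4 + ["B"] * 4
adjusted, effects = remove_study_effect(levels, labels)
print(effects)
print(adjusted)
```

Because each sample in a study is shifted by the same amount, differences between samples within a study are left intact while the gap between study means shrinks. A full analysis would more plausibly rely on a dedicated mixed-model library than on this moment-based estimate.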
format Online
Article
Text
id pubmed-8467830
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8467830 2021-09-27 A New Pipeline for the Normalization and Pooling of Metabolomics Data Metabolites Article MDPI 2021-09-17 /pmc/articles/PMC8467830/ /pubmed/34564446 http://dx.doi.org/10.3390/metabo11090631 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A New Pipeline for the Normalization and Pooling of Metabolomics Data
topic Article