Statistical Workflow for Feature Selection in Human Metabolomics Data

High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity underlying human health and disease. Large-scale metabolomics data sources, generated using either targeted or nontargeted platforms, ar...

Bibliographic Details
Main authors: Antonelli, Joseph, Claggett, Brian L., Henglin, Mir, Kim, Andy, Ovsak, Gavin, Kim, Nicole, Deng, Katherine, Rao, Kevin, Tyagi, Octavia, Watrous, Jeramie D., Lagerborg, Kim A., Hushcha, Pavel V., Demler, Olga V., Mora, Samia, Niiranen, Teemu J., Pereira, Alexandre C., Jain, Mohit, Cheng, Susan
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6680705/
https://www.ncbi.nlm.nih.gov/pubmed/31336989
http://dx.doi.org/10.3390/metabo9070143
author Antonelli, Joseph
Claggett, Brian L.
Henglin, Mir
Kim, Andy
Ovsak, Gavin
Kim, Nicole
Deng, Katherine
Rao, Kevin
Tyagi, Octavia
Watrous, Jeramie D.
Lagerborg, Kim A.
Hushcha, Pavel V.
Demler, Olga V.
Mora, Samia
Niiranen, Teemu J.
Pereira, Alexandre C.
Jain, Mohit
Cheng, Susan
author_sort Antonelli, Joseph
collection PubMed
description High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity underlying human health and disease. Large-scale metabolomics data sources, generated using either targeted or nontargeted platforms, are becoming more common. Appropriate statistical analysis of these complex high-dimensional data will be critical for extracting meaningful results from such large-scale human metabolomics studies. Therefore, we consider the statistical analytical approaches that have been employed in prior human metabolomics studies. Based on the lessons learned and collective experience to date in the field, we offer a step-by-step framework for pursuing statistical analyses of cohort-based human metabolomics data, with a focus on feature selection. We discuss the range of options and approaches that may be employed at each stage of data management, analysis, and interpretation and offer guidance on the analytical decisions that need to be considered over the course of implementing a data analysis workflow. Certain pervasive analytical challenges facing the field warrant ongoing focused research. Addressing these challenges, particularly those related to analyzing human metabolomics data, will allow for greater standardization of, as well as advances in, how research in the field is practiced. In turn, such major analytical advances will lead to substantial improvements in the overall contributions of human metabolomics investigations.
format Online
Article
Text
id pubmed-6680705
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6680705 2019-08-09 Statistical Workflow for Feature Selection in Human Metabolomics Data Metabolites Review
MDPI 2019-07-12 /pmc/articles/PMC6680705/ /pubmed/31336989 http://dx.doi.org/10.3390/metabo9070143 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Statistical Workflow for Feature Selection in Human Metabolomics Data
topic Review