
Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science)


Bibliographic Details
Main authors: Zeng, Irene Sui Lan; Lumley, Thomas
Format: Online Article Text
Language: English
Published: SAGE Publications 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5824897/
https://www.ncbi.nlm.nih.gov/pubmed/29497285
http://dx.doi.org/10.1177/1177932218759292
author Zeng, Irene Sui Lan
Lumley, Thomas
collection PubMed
description Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary, integrating knowledge from biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and that integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than features, and the use of a Bayesian approach when there is prior knowledge to be integrated, are also included in the commentary. For the completeness of the review, a table of currently available software and packages for omics, drawn from 23 publications, is summarized in the appendix.
format Online
Article
Text
id pubmed-5824897
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-5824897 2018-03-01 Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science) Zeng, Irene Sui Lan Lumley, Thomas Bioinform Biol Insights Review
SAGE Publications 2018-02-20 /pmc/articles/PMC5824897/ /pubmed/29497285 http://dx.doi.org/10.1177/1177932218759292 Text en © The Author(s) 2018 http://www.creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (http://www.creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science)
topic Review