
Machine Learning and Integrative Analysis of Biomedical Big Data

Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical an...


Bibliographic Details
Main Authors: Mirza, Bilal, Wang, Wei, Wang, Jie, Choi, Howard, Chung, Neo Christopher, Ping, Peipei
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6410075/
https://www.ncbi.nlm.nih.gov/pubmed/30696086
http://dx.doi.org/10.3390/genes10020087
author Mirza, Bilal
Wang, Wei
Wang, Jie
Choi, Howard
Chung, Neo Christopher
Ping, Peipei
collection PubMed
description Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
format Online
Article
Text
id pubmed-6410075
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6410075 2019-03-26
Machine Learning and Integrative Analysis of Biomedical Big Data
Mirza, Bilal; Wang, Wei; Wang, Jie; Choi, Howard; Chung, Neo Christopher; Ping, Peipei
Genes (Basel)
Review
MDPI 2019-01-28
/pmc/articles/PMC6410075/ /pubmed/30696086 http://dx.doi.org/10.3390/genes10020087
Text en
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Machine Learning and Integrative Analysis of Biomedical Big Data
topic Review