TiMEG: an integrative statistical method for partially missing multi-omics data
Multi-omics data integration is widely used to understand the genetic architecture of disease. In multi-omics association analysis, data collected on multiple omics for the same set of individuals are immensely important for biomarker identification. But when the sample size of such data is limited,...
Main Authors: | Das, Sarmistha, Mukhopadhyay, Indranil |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8674330/ https://www.ncbi.nlm.nih.gov/pubmed/34911979 http://dx.doi.org/10.1038/s41598-021-03034-z |
_version_ | 1784615628175310848 |
---|---|
author | Das, Sarmistha Mukhopadhyay, Indranil |
author_facet | Das, Sarmistha Mukhopadhyay, Indranil |
author_sort | Das, Sarmistha |
collection | PubMed |
description | Multi-omics data integration is widely used to understand the genetic architecture of disease. In multi-omics association analysis, data collected on multiple omics for the same set of individuals are immensely important for biomarker identification. But when the sample size of such data is limited, the presence of partially missing individual-level observations poses a major challenge in data integration. More often, genotype data are available for all individuals under study but gene expression and/or methylation information are missing for different subsets of those individuals. Here, we develop a statistical model, TiMEG, for the identification of disease-associated biomarkers in a case–control paradigm by integrating the above-mentioned data types, especially in the presence of missing omics data. Based on a likelihood approach, TiMEG exploits the inter-relationship among multiple omics data to capture weaker signals that remain unidentified in single-omic analysis or common imputation-based methods. Its application to a real tuberous sclerosis dataset identified functionally relevant genes in the disease pathway. |
format | Online Article Text |
id | pubmed-8674330 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8674330 2021-12-20 TiMEG: an integrative statistical method for partially missing multi-omics data Das, Sarmistha Mukhopadhyay, Indranil Sci Rep Article Multi-omics data integration is widely used to understand the genetic architecture of disease. In multi-omics association analysis, data collected on multiple omics for the same set of individuals are immensely important for biomarker identification. But when the sample size of such data is limited, the presence of partially missing individual-level observations poses a major challenge in data integration. More often, genotype data are available for all individuals under study but gene expression and/or methylation information are missing for different subsets of those individuals. Here, we develop a statistical model, TiMEG, for the identification of disease-associated biomarkers in a case–control paradigm by integrating the above-mentioned data types, especially in the presence of missing omics data. Based on a likelihood approach, TiMEG exploits the inter-relationship among multiple omics data to capture weaker signals that remain unidentified in single-omic analysis or common imputation-based methods. Its application to a real tuberous sclerosis dataset identified functionally relevant genes in the disease pathway. Nature Publishing Group UK 2021-12-15 /pmc/articles/PMC8674330/ /pubmed/34911979 http://dx.doi.org/10.1038/s41598-021-03034-z Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Das, Sarmistha Mukhopadhyay, Indranil TiMEG: an integrative statistical method for partially missing multi-omics data |
title | TiMEG: an integrative statistical method for partially missing multi-omics data |
title_full | TiMEG: an integrative statistical method for partially missing multi-omics data |
title_fullStr | TiMEG: an integrative statistical method for partially missing multi-omics data |
title_full_unstemmed | TiMEG: an integrative statistical method for partially missing multi-omics data |
title_short | TiMEG: an integrative statistical method for partially missing multi-omics data |
title_sort | timeg: an integrative statistical method for partially missing multi-omics data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8674330/ https://www.ncbi.nlm.nih.gov/pubmed/34911979 http://dx.doi.org/10.1038/s41598-021-03034-z |
work_keys_str_mv | AT dassarmistha timeganintegrativestatisticalmethodforpartiallymissingmultiomicsdata AT mukhopadhyayindranil timeganintegrativestatisticalmethodforpartiallymissingmultiomicsdata |