Integrating omics datasets with the OmicsPLS package
BACKGROUND: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific vari...
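
The full description in this record names Two-way Orthogonal Partial Least Squares (O2PLS) as the method for separating joint from data-specific variation. As a sketch only, with notation that is assumed here rather than taken from this record, the O2PLS decomposition of two data matrices X and Y is commonly written as:

```latex
\begin{aligned}
X &= T W^{\top} + T_{\perp} P_{\perp}^{\top} + E, \\
Y &= U C^{\top} + U_{\perp} Q_{\perp}^{\top} + F, \\
U &= T B + H.
\end{aligned}
```

In this notation, T and U are the joint scores with loadings W and C, the terms with the subscript ⊥ hold the variation specific to X and to Y respectively, B links the joint scores, and E, F and H are residuals. The number of columns in the joint and specific parts corresponds to the "number of components" that the abstract says is chosen by cross-validation.
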
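Main Authors: | Bouhaddani, Said el; Uh, Hae-Won; Jongbloed, Geurt; Hayward, Caroline; Klarić, Lucija; Kiełbasa, Szymon M.; Houwing-Duistermaat, Jeanine |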
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Software |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6182835/ https://www.ncbi.nlm.nih.gov/pubmed/30309317 http://dx.doi.org/10.1186/s12859-018-2371-3 |
_version_ | 1783362656955006976 |
---|---|
author | Bouhaddani, Said el; Uh, Hae-Won; Jongbloed, Geurt; Hayward, Caroline; Klarić, Lucija; Kiełbasa, Szymon M.; Houwing-Duistermaat, Jeanine |
author_facet | Bouhaddani, Said el; Uh, Hae-Won; Jongbloed, Geurt; Hayward, Caroline; Klarić, Lucija; Kiełbasa, Szymon M.; Houwing-Duistermaat, Jeanine |
author_sort | Bouhaddani, Said el |
collection | PubMed |
description | BACKGROUND: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, this is not the case with O2PLS. RESULTS: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and a faster alternative cross-validation method are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. CONCLUSIONS: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS"). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2371-3) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6182835 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6182835 2018-10-18 Integrating omics datasets with the OmicsPLS package Bouhaddani, Said el; Uh, Hae-Won; Jongbloed, Geurt; Hayward, Caroline; Klarić, Lucija; Kiełbasa, Szymon M.; Houwing-Duistermaat, Jeanine BMC Bioinformatics Software BioMed Central 2018-10-11 /pmc/articles/PMC6182835/ /pubmed/30309317 http://dx.doi.org/10.1186/s12859-018-2371-3 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Integrating omics datasets with the OmicsPLS package |
title_sort | integrating omics datasets with the omicspls package |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6182835/ https://www.ncbi.nlm.nih.gov/pubmed/30309317 http://dx.doi.org/10.1186/s12859-018-2371-3 |
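
The description above says O2PLS separates variation shared by two data sets from variation specific to each. As a rough illustration of the joint part only, the sketch below estimates joint loadings from the singular value decomposition of the cross-product of two centered matrices, a step commonly described as the starting point of O2PLS. It is written in Python with NumPy rather than R, it is not the OmicsPLS implementation, and the function name `joint_loadings`, the simulated data and all sizes and effect strengths are assumptions made for this example.

```python
import numpy as np


def joint_loadings(X, Y, n_joint):
    """Return the leading singular vectors of the centered cross-product X^T Y.

    W (p x n_joint) and C (q x n_joint) are candidate joint loading
    matrices for X and Y; s holds the matching singular values.
    Hypothetical helper for illustration, not part of OmicsPLS.
    """
    Xc = X - X.mean(axis=0)  # column-center both data sets
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return U[:, :n_joint], Vt[:n_joint].T, s[:n_joint]


# Simulated data: sizes and effect strengths are arbitrary assumptions.
rng = np.random.default_rng(0)
n, p, q = 200, 30, 20

w_true = np.zeros(p)
w_true[:5] = 1 / np.sqrt(5)      # unit-norm joint loading for X
c_true = np.zeros(q)
c_true[:4] = 0.5                 # unit-norm joint loading for Y
p_spec = np.zeros(p)
p_spec[10:15] = 1 / np.sqrt(5)   # X-specific loading, orthogonal to w_true

t = rng.normal(size=n)           # joint score shared by X and Y
t_spec = rng.normal(size=n)      # score of the X-specific part

X = (3 * np.outer(t, w_true)
     + 2 * np.outer(t_spec, p_spec)
     + 0.5 * rng.normal(size=(n, p)))
Y = 3 * np.outer(t, c_true) + 0.5 * rng.normal(size=(n, q))

W, C, s = joint_loadings(X, Y, n_joint=1)

# Absolute cosine between estimated and true loadings (the sign is arbitrary).
align_x = abs(W[:, 0] @ w_true)
align_y = abs(C[:, 0] @ c_true)
print(f"alignment with true X loading: {align_x:.3f}")
print(f"alignment with true Y loading: {align_y:.3f}")
```

A full O2PLS fit would go on to estimate and remove the data-specific components before refitting the joint part; in R, the package named in this record is the place to look for that, installed with `install.packages("OmicsPLS")`.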