Relationship between gene regulation network structure and prediction accuracy in high dimensional regression
The least absolute shrinkage and selection operator (lasso) and principal component regression (PCR) are popular methods of estimating traits from high-dimensional omics data, such as transcriptomes. The prediction accuracy of these estimation methods is highly dependent on the covariance structure,...
Main authors: | Okinaga, Yuichi; Kyogoku, Daisuke; Kondo, Satoshi; Nagano, Atsushi J.; Hirose, Kei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8169869/ https://www.ncbi.nlm.nih.gov/pubmed/34075095 http://dx.doi.org/10.1038/s41598-021-90791-6 |
_version_ | 1783702114547007488 |
---|---|
author | Okinaga, Yuichi Kyogoku, Daisuke Kondo, Satoshi Nagano, Atsushi J. Hirose, Kei |
author_facet | Okinaga, Yuichi Kyogoku, Daisuke Kondo, Satoshi Nagano, Atsushi J. Hirose, Kei |
author_sort | Okinaga, Yuichi |
collection | PubMed |
description | The least absolute shrinkage and selection operator (lasso) and principal component regression (PCR) are popular methods of estimating traits from high-dimensional omics data, such as transcriptomes. The prediction accuracy of these estimation methods is highly dependent on the covariance structure, which is characterized by gene regulation networks. However, the manner in which the structure of a gene regulation network together with the sample size affects prediction accuracy has not yet been sufficiently investigated. In this study, Monte Carlo simulations are conducted to investigate the prediction accuracy for several network structures under various sample sizes. When the gene regulation network is a random graph, a sufficiently large number of observations are required to ensure good prediction accuracy with the lasso. The PCR provided poor prediction accuracy regardless of the sample size. However, a real gene regulation network is likely to exhibit a scale-free structure. In such cases, the simulation indicates that a relatively small number of observations, such as [Formula: see text], is sufficient to allow the accurate prediction of traits from a transcriptome with the lasso. |
format | Online Article Text |
id | pubmed-8169869 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8169869 2021-06-03 Relationship between gene regulation network structure and prediction accuracy in high dimensional regression Okinaga, Yuichi Kyogoku, Daisuke Kondo, Satoshi Nagano, Atsushi J. Hirose, Kei Sci Rep Article The least absolute shrinkage and selection operator (lasso) and principal component regression (PCR) are popular methods of estimating traits from high-dimensional omics data, such as transcriptomes. The prediction accuracy of these estimation methods is highly dependent on the covariance structure, which is characterized by gene regulation networks. However, the manner in which the structure of a gene regulation network together with the sample size affects prediction accuracy has not yet been sufficiently investigated. In this study, Monte Carlo simulations are conducted to investigate the prediction accuracy for several network structures under various sample sizes. When the gene regulation network is a random graph, a sufficiently large number of observations are required to ensure good prediction accuracy with the lasso. The PCR provided poor prediction accuracy regardless of the sample size. However, a real gene regulation network is likely to exhibit a scale-free structure. In such cases, the simulation indicates that a relatively small number of observations, such as [Formula: see text], is sufficient to allow the accurate prediction of traits from a transcriptome with the lasso. Nature Publishing Group UK 2021-06-01 /pmc/articles/PMC8169869/ /pubmed/34075095 http://dx.doi.org/10.1038/s41598-021-90791-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Okinaga, Yuichi Kyogoku, Daisuke Kondo, Satoshi Nagano, Atsushi J. Hirose, Kei Relationship between gene regulation network structure and prediction accuracy in high dimensional regression |
title | Relationship between gene regulation network structure and prediction accuracy in high dimensional regression |
title_full | Relationship between gene regulation network structure and prediction accuracy in high dimensional regression |
title_fullStr | Relationship between gene regulation network structure and prediction accuracy in high dimensional regression |
title_full_unstemmed | Relationship between gene regulation network structure and prediction accuracy in high dimensional regression |
title_short | Relationship between gene regulation network structure and prediction accuracy in high dimensional regression |
title_sort | relationship between gene regulation network structure and prediction accuracy in high dimensional regression |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8169869/ https://www.ncbi.nlm.nih.gov/pubmed/34075095 http://dx.doi.org/10.1038/s41598-021-90791-6 |
work_keys_str_mv | AT okinagayuichi relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression AT kyogokudaisuke relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression AT kondosatoshi relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression AT naganoatsushij relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression AT hirosekei relationshipbetweengeneregulationnetworkstructureandpredictionaccuracyinhighdimensionalregression |
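
The description field in this record outlines a Monte Carlo comparison of the lasso and PCR under random-graph and scale-free network structures. The following is a minimal Python sketch of that kind of experiment, not the authors' code. Everything specific in it is an assumption made for illustration: the preferential-attachment graph generator, the choice of precision matrix (graph Laplacian plus identity), the sparse coefficient vector, the penalty `lam`, the number of components `k`, and all function names. It uses only NumPy, with the lasso fitted by plain coordinate descent.

```python
import numpy as np


def scale_free_adjacency(p, rng):
    """Preferential attachment (assumed generator): each new node links to one
    existing node chosen with probability proportional to (degree + 1)."""
    A = np.zeros((p, p))
    for new in range(1, p):
        weights = A[:new, :new].sum(axis=1) + 1.0
        old = rng.choice(new, p=weights / weights.sum())
        A[new, old] = A[old, new] = 1.0
    return A


def random_adjacency(p, n_edges, rng):
    """Random graph with n_edges edges placed uniformly at random."""
    A = np.zeros((p, p))
    rows, cols = np.triu_indices(p, k=1)
    pick = rng.choice(len(rows), size=n_edges, replace=False)
    A[rows[pick], cols[pick]] = 1.0
    return A + A.T


def sample_expression(A, n, rng):
    """Draw n zero-mean Gaussian samples whose precision matrix is
    Laplacian(A) + I (an assumed, always positive-definite choice)."""
    p = len(A)
    omega = np.diag(A.sum(axis=1)) - A + np.eye(p)
    sigma = np.linalg.inv(omega)
    chol = np.linalg.cholesky((sigma + sigma.T) / 2.0)
    return rng.standard_normal((n, p)) @ chol.T


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1 / 2n) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]          # remove coordinate j from the fit
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]          # put the updated coordinate back
    return beta


def pcr(X, y, k):
    """Principal component regression: least squares on the first k
    principal component scores, mapped back to coefficients on X."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    V = vt[:k].T
    gamma = np.linalg.lstsq(X @ V, y, rcond=None)[0]
    return V @ gamma


def compare(A, n_train, rng, n_test=500, lam=0.05, k=5):
    """Return test mean squared error of (lasso, PCR) for one simulated data set."""
    p = len(A)
    beta = np.zeros(p)
    beta[:5] = 1.0                              # hypothetical sparse true coefficients
    X = sample_expression(A, n_train + n_test, rng)
    y = X @ beta + rng.standard_normal(n_train + n_test)
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_te, y_te = X[n_train:], y[n_train:]
    errors = []
    for b in (lasso_cd(X_tr, y_tr, lam), pcr(X_tr, y_tr, k)):
        errors.append(float(np.mean((y_te - X_te @ b) ** 2)))
    return tuple(errors)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 50
    graphs = {
        "scale-free": scale_free_adjacency(p, rng),
        "random": random_adjacency(p, p - 1, rng),
    }
    for name, A in graphs.items():
        lasso_mse, pcr_mse = compare(A, n_train=100, rng=rng)
        print(f"{name}: lasso MSE = {lasso_mse:.3f}, PCR MSE = {pcr_mse:.3f}")
```

A single run like this is only one Monte Carlo replicate. To examine the trends the description reports, one would repeat `compare` over many seeds and several values of `n_train`, and choose `lam` and `k` by cross-validation rather than fixing them.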