Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method
BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms are proposed to facilitate data integration. However, their respective advantages and limi...
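Main authors: | Yu, Ying; Zhang, Naixin; Mai, Yuanbang; Ren, Luyao; Chen, Qiaochu; Cao, Zehui; Chen, Qingwang; Liu, Yaqing; Hou, Wanwan; Yang, Jingcheng; Hong, Huixiao; Xu, Joshua; Tong, Weida; Dong, Lianhua; Shi, Leming; Fang, Xiang; Zheng, Yuanting |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483871/ https://www.ncbi.nlm.nih.gov/pubmed/37674217 http://dx.doi.org/10.1186/s13059-023-03047-z |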
_version_ | 1785102479166275584 |
---|---|
author | Yu, Ying Zhang, Naixin Mai, Yuanbang Ren, Luyao Chen, Qiaochu Cao, Zehui Chen, Qingwang Liu, Yaqing Hou, Wanwan Yang, Jingcheng Hong, Huixiao Xu, Joshua Tong, Weida Dong, Lianhua Shi, Leming Fang, Xiang Zheng, Yuanting |
author_sort | Yu, Ying |
collection | PubMed |
description | BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms have been proposed to facilitate data integration. However, their respective advantages and limitations have not been adequately assessed in terms of omics types, performance metrics, and application scenarios. RESULTS: As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch-effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability to accurately cluster cross-batch samples by donor. The ratio-based method, i.e., scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than the others, especially when batch effects are completely confounded with the biological factors of study interest. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies. CONCLUSIONS: Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03047-z. |
format | Online Article Text |
id | pubmed-10483871 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-10483871 2023-09-08 Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method Yu, Ying Zhang, Naixin Mai, Yuanbang Ren, Luyao Chen, Qiaochu Cao, Zehui Chen, Qingwang Liu, Yaqing Hou, Wanwan Yang, Jingcheng Hong, Huixiao Xu, Joshua Tong, Weida Dong, Lianhua Shi, Leming Fang, Xiang Zheng, Yuanting Genome Biol Research BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms have been proposed to facilitate data integration. However, their respective advantages and limitations have not been adequately assessed in terms of omics types, performance metrics, and application scenarios. RESULTS: As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch-effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability to accurately cluster cross-batch samples by donor. The ratio-based method, i.e., scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than the others, especially when batch effects are completely confounded with the biological factors of study interest. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies. CONCLUSIONS: Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03047-z. BioMed Central 2023-09-07 /pmc/articles/PMC10483871/ /pubmed/37674217 http://dx.doi.org/10.1186/s13059-023-03047-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method |
title_sort | correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483871/ https://www.ncbi.nlm.nih.gov/pubmed/37674217 http://dx.doi.org/10.1186/s13059-023-03047-z |
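The description field above summarizes the ratio-based method as scaling the absolute feature values of study samples relative to those of reference material profiled in the same batch. The record does not give the exact formula, so the following is only a minimal sketch under stated assumptions: values are strictly positive, the reference level is taken as the per-batch, per-feature mean of the reference replicates on a log2 scale, and the function name `ratio_scale`, its parameters, and the toy data are all hypothetical and not taken from the article.

```python
import numpy as np


def ratio_scale(values, batches, is_reference):
    """Return log2 ratios of every sample to its own batch's reference level.

    values:       (n_samples, n_features) array of positive feature values
    batches:      length-n_samples sequence of batch labels
    is_reference: length-n_samples boolean mask, True for reference-material runs
    (All names are illustrative assumptions, not the authors' implementation.)
    """
    values = np.asarray(values, dtype=float)
    batches = np.asarray(batches)
    is_reference = np.asarray(is_reference, dtype=bool)

    log_ratios = np.empty_like(values)
    for batch in np.unique(batches):
        in_batch = batches == batch
        ref_rows = in_batch & is_reference
        if not ref_rows.any():
            raise ValueError(f"batch {batch!r} has no reference-material sample")
        # Per-feature reference level for this batch (mean on the log2 scale).
        ref_level = np.log2(values[ref_rows]).mean(axis=0)
        # Subtracting in log space equals dividing on the original scale.
        log_ratios[in_batch] = np.log2(values[in_batch]) - ref_level
    return log_ratios


# Toy data: batch "B" measures everything twice as high as batch "A".
demo_values = np.array([
    [10.0, 100.0],   # batch A, reference material
    [20.0, 50.0],    # batch A, study sample
    [20.0, 200.0],   # batch B, reference material
    [40.0, 100.0],   # batch B, study sample (same biology as the A sample)
])
demo_batches = ["A", "A", "B", "B"]
demo_is_reference = [True, False, True, False]

scaled = ratio_scale(demo_values, demo_batches, demo_is_reference)
print(scaled)
```

In this toy example the two study samples end up with matching ratio profiles even though their absolute values differ by the batch factor, which is the behavior the abstract attributes to ratio-based scaling. Because each batch is corrected using only its own reference runs, the sketch does not rely on biological groups being balanced across batches.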