
Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method

BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms are proposed to facilitate data integration. However, their respective advantages and limi...


Bibliographic Details
Main authors: Yu, Ying, Zhang, Naixin, Mai, Yuanbang, Ren, Luyao, Chen, Qiaochu, Cao, Zehui, Chen, Qingwang, Liu, Yaqing, Hou, Wanwan, Yang, Jingcheng, Hong, Huixiao, Xu, Joshua, Tong, Weida, Dong, Lianhua, Shi, Leming, Fang, Xiang, Zheng, Yuanting
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10483871/
https://www.ncbi.nlm.nih.gov/pubmed/37674217
http://dx.doi.org/10.1186/s13059-023-03047-z
author Yu, Ying
Zhang, Naixin
Mai, Yuanbang
Ren, Luyao
Chen, Qiaochu
Cao, Zehui
Chen, Qingwang
Liu, Yaqing
Hou, Wanwan
Yang, Jingcheng
Hong, Huixiao
Xu, Joshua
Tong, Weida
Dong, Lianhua
Shi, Leming
Fang, Xiang
Zheng, Yuanting
collection PubMed
description BACKGROUND: Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms have been proposed to facilitate data integration. However, their respective advantages and limitations have not been adequately assessed in terms of omics types, performance metrics, and application scenarios. RESULTS: As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch-effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability to accurately cluster cross-batch samples into their own donors. The ratio-based method, i.e., scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interest. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies. CONCLUSIONS: Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13059-023-03047-z.
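The ratio-based scaling named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ratio_scale`, the use of the per-batch mean of the reference replicates as the denominator, the absence of any log transform, and the toy numbers are all assumptions made for illustration only.

```python
from collections import defaultdict


def ratio_scale(values, batches, is_reference):
    """Divide every sample by the mean reference profile of its own batch.

    values: list of rows, one row of positive feature values per sample.
    batches: batch label for each row.
    is_reference: True where the row is a reference-material sample.
    (All three parameter names are hypothetical.)
    """
    # Group the reference-material rows by batch.
    ref_rows = defaultdict(list)
    for row, batch, is_ref in zip(values, batches, is_reference):
        if is_ref:
            ref_rows[batch].append(row)

    # Per-batch, per-feature mean of the reference rows.
    ref_mean = {
        batch: [sum(col) / len(rows) for col in zip(*rows)]
        for batch, rows in ref_rows.items()
    }

    scaled = []
    for row, batch in zip(values, batches):
        if batch not in ref_mean:
            raise ValueError(f"batch {batch!r} has no reference sample")
        # Express each feature as a ratio to the concurrently profiled reference.
        scaled.append([v / m for v, m in zip(row, ref_mean[batch])])
    return scaled


# Toy data (invented): batch "B" measures everything twice as high as batch "A".
values = [[10.0, 4.0], [20.0, 2.0], [20.0, 8.0], [40.0, 4.0]]
batches = ["A", "A", "B", "B"]
is_reference = [True, False, True, False]
ratios = ratio_scale(values, batches, is_reference)
```

In this toy case the two study samples (rows 1 and 3) differ by a factor of two in absolute values but end up with identical ratios, because the multiplicative batch shift also affects the reference sample and cancels in the division.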
format Online
Article
Text
id pubmed-10483871
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10483871 2023-09-08 Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method Genome Biol Research BioMed Central 2023-09-07 /pmc/articles/PMC10483871/ /pubmed/37674217 http://dx.doi.org/10.1186/s13059-023-03047-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method
topic Research