Normalization of Large-Scale Transcriptome Data Using Heuristic Methods
In this study, we introduce an artificial intelligence method for addressing the batch effect in transcriptome data. The method has several clear advantages over the alternative methods presently in use. Batch effect refers to the discrepancy between gene expression data series measured u...
Main authors: | Yosef, Arthur; Shnaider, Eli; Schneider, Moti; Gurevich, Michael |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10068970/ https://www.ncbi.nlm.nih.gov/pubmed/37020503 http://dx.doi.org/10.1177/11779322231160397 |
_version_ | 1785018766440005632 |
---|---|
author | Yosef, Arthur Shnaider, Eli Schneider, Moti Gurevich, Michael |
author_facet | Yosef, Arthur Shnaider, Eli Schneider, Moti Gurevich, Michael |
author_sort | Yosef, Arthur |
collection | PubMed |
description | In this study, we introduce an artificial intelligence method for addressing the batch effect in transcriptome data. The method has several clear advantages over the alternative methods presently in use. Batch effect refers to the discrepancy between gene expression data series measured under different conditions. While data from the same batch (measurements performed under the same conditions) are compatible, combining various batches into one data set is problematic because the measurements are incompatible. Therefore, the combined data must be corrected (normalized) before biological analysis is performed. Numerous methods attempt to correct a data set for batch effect. These methods rely on various assumptions regarding the distribution of the measurements. Forcing the data elements into a presupposed distribution can severely distort biological signals, leading to incorrect results and conclusions. The wider the discrepancy between the assumed and the actual data distribution, the greater the biases introduced by such “correction methods.” We introduce a heuristic method to reduce batch effect. The method does not rely on any assumptions regarding the distribution or the behavior of data elements. Hence, it does not introduce any new biases in the process of correcting the batch effect. It strictly maintains the integrity of measurements within the original batches. |
format | Online Article Text |
id | pubmed-10068970 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-10068970 2023-04-04 Normalization of Large-Scale Transcriptome Data Using Heuristic Methods Yosef, Arthur Shnaider, Eli Schneider, Moti Gurevich, Michael Bioinform Biol Insights Original Research Article In this study, we introduce an artificial intelligence method for addressing the batch effect in transcriptome data. The method has several clear advantages over the alternative methods presently in use. Batch effect refers to the discrepancy between gene expression data series measured under different conditions. While data from the same batch (measurements performed under the same conditions) are compatible, combining various batches into one data set is problematic because the measurements are incompatible. Therefore, the combined data must be corrected (normalized) before biological analysis is performed. Numerous methods attempt to correct a data set for batch effect. These methods rely on various assumptions regarding the distribution of the measurements. Forcing the data elements into a presupposed distribution can severely distort biological signals, leading to incorrect results and conclusions. The wider the discrepancy between the assumed and the actual data distribution, the greater the biases introduced by such “correction methods.” We introduce a heuristic method to reduce batch effect. The method does not rely on any assumptions regarding the distribution or the behavior of data elements. Hence, it does not introduce any new biases in the process of correcting the batch effect. It strictly maintains the integrity of measurements within the original batches. SAGE Publications 2023-03-31 /pmc/articles/PMC10068970/ /pubmed/37020503 http://dx.doi.org/10.1177/11779322231160397 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Original Research Article Yosef, Arthur Shnaider, Eli Schneider, Moti Gurevich, Michael Normalization of Large-Scale Transcriptome Data Using Heuristic Methods |
title | Normalization of Large-Scale Transcriptome Data Using Heuristic Methods |
title_full | Normalization of Large-Scale Transcriptome Data Using Heuristic Methods |
title_fullStr | Normalization of Large-Scale Transcriptome Data Using Heuristic Methods |
title_full_unstemmed | Normalization of Large-Scale Transcriptome Data Using Heuristic Methods |
title_short | Normalization of Large-Scale Transcriptome Data Using Heuristic Methods |
title_sort | normalization of large-scale transcriptome data using heuristic methods |
topic | Original Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10068970/ https://www.ncbi.nlm.nih.gov/pubmed/37020503 http://dx.doi.org/10.1177/11779322231160397 |
work_keys_str_mv | AT yosefarthur normalizationoflargescaletranscriptomedatausingheuristicmethods AT shnaidereli normalizationoflargescaletranscriptomedatausingheuristicmethods AT schneidermoti normalizationoflargescaletranscriptomedatausingheuristicmethods AT gurevichmichael normalizationoflargescaletranscriptomedatausingheuristicmethods |