AICM: A Genuine Framework for Correcting Inconsistency Between Large Pharmacogenomics Datasets
The inconsistency of open pharmacogenomics datasets produced by different studies limits the usage of such datasets in many tasks, such as biomarker discovery. Investigation of multiple pharmacogenomics datasets confirmed that the pairwise sensitivity data correlation between drugs, or rows, across...
Main Authors: | Hu, Zhiyue Tom; Ye, Yuting; Newbury, Patrick A.; Huang, Haiyan; Chen, Bin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6417811/ https://www.ncbi.nlm.nih.gov/pubmed/30864327 |
_version_ | 1783403626791698432 |
---|---|
author | Hu, Zhiyue Tom; Ye, Yuting; Newbury, Patrick A.; Huang, Haiyan; Chen, Bin |
author_facet | Hu, Zhiyue Tom; Ye, Yuting; Newbury, Patrick A.; Huang, Haiyan; Chen, Bin |
author_sort | Hu, Zhiyue Tom |
collection | PubMed |
description | The inconsistency of open pharmacogenomics datasets produced by different studies limits the usage of such datasets in many tasks, such as biomarker discovery. Investigation of multiple pharmacogenomics datasets confirmed that the pairwise sensitivity data correlation between drugs, or rows, across different studies (drug-wise) is relatively low, while the pairwise sensitivity data correlation between cell-lines, or columns, across different studies (cell-wise) is considerably strong. This common interesting observation across multiple pharmacogenomics datasets suggests the existence of subtle consistency among the different studies (i.e., strong cell-wise correlation). However, significant noises are also shown (i.e., weak drug-wise correlation) and have prevented researchers from comfortably using the data directly. Motivated by this observation, we propose a novel framework for addressing the inconsistency between large-scale pharmacogenomics data sets. Our method can significantly boost the drug-wise correlation and can be easily applied to re-summarized and normalized datasets proposed by others. We also investigate our algorithm based on many different criteria to demonstrate that the corrected datasets are not only consistent, but also biologically meaningful. Eventually, we propose to extend our main algorithm into a framework, so that in the future when more datasets become publicly available, our framework can hopefully offer a “ground-truth” guidance for references. |
format | Online Article Text |
id | pubmed-6417811 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
record_format | MEDLINE/PubMed |
spelling | pubmed-6417811 2019-03-14 AICM: A Genuine Framework for Correcting Inconsistency Between Large Pharmacogenomics Datasets Hu, Zhiyue Tom; Ye, Yuting; Newbury, Patrick A.; Huang, Haiyan; Chen, Bin Pac Symp Biocomput Article 2019 /pmc/articles/PMC6417811/ /pubmed/30864327 Text en http://www.creativecommons.org/licenses/by-nc/3.0/ Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution Non-Commercial (CC BY-NC) 4.0 License. |
title | AICM: A Genuine Framework for Correcting Inconsistency Between Large Pharmacogenomics Datasets |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6417811/ https://www.ncbi.nlm.nih.gov/pubmed/30864327 |
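The description field in this record contrasts drug-wise correlation (between matching rows of two studies) with cell-wise correlation (between matching columns). The sketch below only illustrates how those two quantities can be computed; it is not the AICM algorithm. The data are synthetic, and the function names, matrix sizes, and noise levels are assumptions chosen for illustration, not values from the paper.

```python
import numpy as np


def rowwise_correlation(a, b):
    """Pearson correlation between each row of a and the matching row of b."""
    a_c = a - a.mean(axis=1, keepdims=True)
    b_c = b - b.mean(axis=1, keepdims=True)
    num = (a_c * b_c).sum(axis=1)
    den = np.sqrt((a_c ** 2).sum(axis=1) * (b_c ** 2).sum(axis=1))
    return num / den


def drug_wise(a, b):
    """One correlation per drug (row), computed across cell lines."""
    return rowwise_correlation(a, b)


def cell_wise(a, b):
    """One correlation per cell line (column), computed across drugs."""
    return rowwise_correlation(a.T, b.T)


# Synthetic example (assumption): rows are drugs, columns are cell lines.
rng = np.random.default_rng(0)
n_drugs, n_cells = 15, 40

# Large differences between drugs, small differences between cell lines.
drug_effect = rng.normal(0.0, 2.0, size=(n_drugs, 1))
cell_effect = rng.normal(0.0, 0.2, size=(n_drugs, n_cells))
truth = drug_effect + cell_effect

# Two hypothetical studies measuring the same sensitivities with independent noise.
study_a = truth + rng.normal(0.0, 0.5, size=truth.shape)
study_b = truth + rng.normal(0.0, 0.5, size=truth.shape)

drug_corr = drug_wise(study_a, study_b)
cell_corr = cell_wise(study_a, study_b)

print("mean drug-wise correlation:", drug_corr.mean())
print("mean cell-wise correlation:", cell_corr.mean())
```

In this toy setup the between-drug differences dominate each column, so the cell-wise correlations come out high, while the small within-drug signal is swamped by study noise, so the drug-wise correlations come out low. That is one simple way the pattern described in the abstract can arise; the paper itself should be consulted for how the real datasets behave.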