AICM: A Genuine Framework for Correcting Inconsistency Between Large Pharmacogenomics Datasets

The inconsistency of open pharmacogenomics datasets produced by different studies limits the usage of such datasets in many tasks, such as biomarker discovery. Investigation of multiple pharmacogenomics datasets confirmed that the pairwise sensitivity data correlation between drugs, or rows, across...

Bibliographic Details
Main Authors: Hu, Zhiyue Tom; Ye, Yuting; Newbury, Patrick A.; Huang, Haiyan; Chen, Bin
Format: Online Article Text
Language: English
Published: 2019
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6417811/
https://www.ncbi.nlm.nih.gov/pubmed/30864327
_version_ 1783403626791698432
author Hu, Zhiyue Tom
Ye, Yuting
Newbury, Patrick A.
Huang, Haiyan
Chen, Bin
collection PubMed
description The inconsistency of open pharmacogenomics datasets produced by different studies limits the usage of such datasets in many tasks, such as biomarker discovery. Investigation of multiple pharmacogenomics datasets confirmed that the pairwise sensitivity data correlation between drugs, or rows, across different studies (drug-wise) is relatively low, while the pairwise sensitivity data correlation between cell-lines, or columns, across different studies (cell-wise) is considerably strong. This observation, common to multiple pharmacogenomics datasets, suggests that the studies share a subtle consistency (i.e., strong cell-wise correlation). However, the data are also substantially noisy (i.e., weak drug-wise correlation), which has prevented researchers from comfortably using them directly. Motivated by this observation, we propose a novel framework for addressing the inconsistency between large-scale pharmacogenomics datasets. Our method can significantly boost the drug-wise correlation and can be easily applied to re-summarized and normalized datasets proposed by others. We also evaluate our algorithm against many different criteria to demonstrate that the corrected datasets are not only consistent but also biologically meaningful. Finally, we propose to extend our main algorithm into a framework so that, as more datasets become publicly available, it can hopefully offer “ground-truth” guidance for reference.
format Online
Article
Text
id pubmed-6417811
institution National Center for Biotechnology Information
language English
publishDate 2019
record_format MEDLINE/PubMed
spelling pubmed-6417811 2019-03-14 AICM: A Genuine Framework for Correcting Inconsistency Between Large Pharmacogenomics Datasets Hu, Zhiyue Tom Ye, Yuting Newbury, Patrick A. Huang, Haiyan Chen, Bin Pac Symp Biocomput Article
2019 /pmc/articles/PMC6417811/ /pubmed/30864327 Text en http://www.creativecommons.org/licenses/by-nc/3.0/ Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution Non-Commercial (CC BY-NC) 4.0 License.
title AICM: A Genuine Framework for Correcting Inconsistency Between Large Pharmacogenomics Datasets
topic Article
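
The description in this record contrasts drug-wise (row) and cell-wise (column) correlation between studies. The sketch below illustrates only that diagnostic, not the AICM correction itself, which the record does not describe. It assumes each study is a drug × cell-line sensitivity matrix with matching row and column order, and that the correlation meant is Pearson correlation between matched rows or matched columns; the function names and the toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def drug_wise(study_a, study_b):
    """Correlate each drug (row) in study A with the same row in study B."""
    return [pearson(row_a, row_b) for row_a, row_b in zip(study_a, study_b)]


def cell_wise(study_a, study_b):
    """Correlate each cell line (column) in study A with the same column in study B."""
    cols_a, cols_b = zip(*study_a), zip(*study_b)
    return [pearson(col_a, col_b) for col_a, col_b in zip(cols_a, cols_b)]


# Hypothetical toy data: 3 drugs (rows) x 4 cell lines (columns).
# Drugs differ strongly in overall potency, but the variation of one drug
# across cell lines is small and noisy in both studies.
study_a = [
    [1.0, 1.2, 0.9, 1.1],
    [5.0, 5.1, 4.8, 5.2],
    [9.0, 8.9, 9.2, 9.1],
]
study_b = [
    [1.0, 1.0, 1.1, 1.2],
    [4.9, 5.2, 5.1, 4.8],
    [9.1, 9.0, 9.0, 9.2],
]

drug_r = drug_wise(study_a, study_b)
cell_r = cell_wise(study_a, study_b)

print("drug-wise:", [round(r, 3) for r in drug_r])
print("cell-wise:", [round(r, 3) for r in cell_r])
```

In this toy case the column correlations are dominated by the large potency differences between drugs, so they come out strong, while the row correlations depend on small within-drug differences and come out weak. That is the pattern the description reports across real datasets.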