
Integrating Incompatible Assay Data Sets with Deep Preference Learning


Bibliographic Details
Main authors: Sun, Xiaolin, Tamura, Ryo, Sumita, Masato, Mori, Kenichi, Terayama, Kei, Tsuda, Koji
Format: Online Article Text
Language: English
Published: American Chemical Society 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8762726/
https://www.ncbi.nlm.nih.gov/pubmed/35047110
http://dx.doi.org/10.1021/acsmedchemlett.1c00439
Description
Summary: A large amount of bioactivity assay data is already accumulated in public databases, but the integration of these data sets for quantitative structure–activity relationship (QSAR) studies is not straightforward due to differences in experimental methods and settings. We present an efficient deep-learning-based approach called Deep Preference Data Integration (DPDI). For integrating outcome variables of different assay types, a surrogate variable is introduced, and a neural network is trained such that the total order induced by the surrogate variable is maximally consistent with given data sets. In a task of predicting efficacy of factor Xa inhibitors, DPDI successfully integrated 2959 molecules distributed in 129 assay data sets. In most of our experiments, data integration improved prediction accuracy strongly in interpolation and extrapolation tasks, indicating that DPDI is an effective tool for QSAR studies.
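
The idea of a surrogate variable whose ordering agrees with each assay can be illustrated with a small sketch. This is not the authors' implementation: the record does not state the loss function, network architecture, or molecular features, so the pairwise logistic loss, the linear scorer standing in for the neural network, the one-dimensional toy descriptor, and all names below (`within_assay_pairs`, `train`, `records`) are assumptions made for illustration. The one point taken from the summary is that molecules are compared only within the same assay, so values measured on incompatible scales are never compared directly.

```python
import math
import random


def within_assay_pairs(records):
    """Return (higher, lower) feature pairs drawn from the same assay only."""
    by_assay = {}
    for assay_id, features, value in records:
        by_assay.setdefault(assay_id, []).append((features, value))
    pairs = []
    for items in by_assay.values():
        for xa, va in items:
            for xb, vb in items:
                if va > vb:
                    pairs.append((xa, xb))
    return pairs


def score(w, x):
    """Surrogate variable: here a linear score (assumed stand-in for a network)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(records, dim, epochs=200, lr=0.1, seed=0):
    """Fit weights by SGD on the pairwise logistic loss log(1 + exp(-margin))."""
    rng = random.Random(seed)
    pairs = within_assay_pairs(records)
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(pairs)
        for hi, lo in pairs:
            margin = score(w, hi) - score(w, lo)
            grad = -sigmoid(-margin)  # derivative of the loss w.r.t. margin
            for i in range(dim):
                w[i] -= lr * grad * (hi[i] - lo[i])
    return w


# Hypothetical toy data: (assay id, descriptor vector, measured value).
# The two assays use different scales; within each, larger means more active.
records = [
    ("assay_A", [0.1], 1.0),
    ("assay_A", [0.5], 2.0),
    ("assay_A", [0.9], 3.0),
    ("assay_B", [0.2], 100.0),
    ("assay_B", [0.4], 250.0),
    ("assay_B", [0.8], 900.0),
]

w = train(records, dim=1)
```

After training, `score(w, x)` places molecules from both assays on one common scale, so a molecule measured only in `assay_A` can be ranked against one measured only in `assay_B`.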