Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications

Dataset shift refers to the problem where the input data distribution may change over time (e.g., between training and test stages). Since this can be a critical bottleneck in several safety-critical applications such as healthcare, drug-discovery, etc., dataset shift detection has become an importa...

Bibliographic Details
Main Authors:	Song, Hoseung; Thiagarajan, Jayaraman J.; Kailkhura, Bhavya
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2021
Subjects:	Artificial Intelligence
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8223254/ https://www.ncbi.nlm.nih.gov/pubmed/34179767 http://dx.doi.org/10.3389/frai.2021.589632

_version_	1783711655714095104
author	Song, Hoseung Thiagarajan, Jayaraman J. Kailkhura, Bhavya
author_facet	Song, Hoseung Thiagarajan, Jayaraman J. Kailkhura, Bhavya
author_sort	Song, Hoseung
collection	PubMed
description	Dataset shift refers to the problem where the input data distribution may change over time (e.g., between training and test stages). Since this can be a critical bottleneck in several safety-critical applications such as healthcare, drug-discovery, etc., dataset shift detection has become an important research issue in machine learning. Though several existing efforts have focused on image/video data, applications with graph-structured data have not received sufficient attention. Therefore, in this paper, we investigate the problem of detecting shifts in graph structured data through the lens of statistical hypothesis testing. Specifically, we propose a practical two-sample test based approach for shift detection in large-scale graph structured data. Our approach is very flexible in that it is suitable for both undirected and directed graphs, and eliminates the need for equal sample sizes. Using empirical studies, we demonstrate the effectiveness of the proposed test in detecting dataset shifts. We also corroborate these findings using real-world datasets, characterized by directed graphs and a large number of nodes.
format	Online Article Text
id	pubmed-8223254
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-8223254 2021-06-25 Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications Song, Hoseung Thiagarajan, Jayaraman J. Kailkhura, Bhavya Front Artif Intell Artificial Intelligence Dataset shift refers to the problem where the input data distribution may change over time (e.g., between training and test stages). Since this can be a critical bottleneck in several safety-critical applications such as healthcare, drug-discovery, etc., dataset shift detection has become an important research issue in machine learning. Though several existing efforts have focused on image/video data, applications with graph-structured data have not received sufficient attention. Therefore, in this paper, we investigate the problem of detecting shifts in graph structured data through the lens of statistical hypothesis testing. Specifically, we propose a practical two-sample test based approach for shift detection in large-scale graph structured data. Our approach is very flexible in that it is suitable for both undirected and directed graphs, and eliminates the need for equal sample sizes. Using empirical studies, we demonstrate the effectiveness of the proposed test in detecting dataset shifts. We also corroborate these findings using real-world datasets, characterized by directed graphs and a large number of nodes. Frontiers Media S.A. 2021-05-18 /pmc/articles/PMC8223254/ /pubmed/34179767 http://dx.doi.org/10.3389/frai.2021.589632 Text en Copyright © 2021 Song, Thiagarajan and Kailkhura. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Artificial Intelligence Song, Hoseung Thiagarajan, Jayaraman J. Kailkhura, Bhavya Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications
title	Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications
title_full	Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications
title_fullStr	Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications
title_full_unstemmed	Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications
title_short	Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications
title_sort	preventing failures by dataset shift detection in safety-critical graph applications
topic	Artificial Intelligence
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8223254/ https://www.ncbi.nlm.nih.gov/pubmed/34179767 http://dx.doi.org/10.3389/frai.2021.589632
work_keys_str_mv	AT songhoseung preventingfailuresbydatasetshiftdetectioninsafetycriticalgraphapplications AT thiagarajanjayaramanj preventingfailuresbydatasetshiftdetectioninsafetycriticalgraphapplications AT kailkhurabhavya preventingfailuresbydatasetshiftdetectioninsafetycriticalgraphapplications

Similar Items