
An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging

Traditional supervised time series classification (TSC) tasks assume that all training data are labeled. In practice, however, manually labeling all unlabeled data can be very time-consuming and often requires the participation of skilled domain experts. In this paper, we are concerned with the positive unlabeled time series classification (PUTSC) problem, which refers to automatically labeling a large unlabeled set U based on a small positive labeled set PL. Self-training (ST) is the most widely used method for solving the PUTSC problem and has attracted increasing attention due to its simplicity and effectiveness. Existing ST methods simply employ the one-nearest-neighbor (1NN) rule to determine which unlabeled time series should be labeled. We note, however, that the 1NN rule might not be optimal for PUTSC tasks, because it can be sensitive to initial labeled data located near the boundary between the positive and negative classes. To overcome this issue, we propose an exploratory methodology called ST-average. Unlike conventional ST-based approaches, ST-average uses the average sequence computed by the DTW barycenter averaging (DBA) technique to label the data. Compared with any individual series in the PL set, the average sequence is more representative. Our proposal is insensitive to the initial labeled data and is more reliable than existing ST-based methods. In addition, we demonstrate that ST-average can naturally be combined with many existing techniques used in the original ST. Experimental results on public datasets show that ST-average performs better than related popular methods.
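
For readers who want the gist of the labeling loop the abstract describes, below is a minimal Python sketch of the ST-average idea, built on the tslearn library's dtw and dtw_barycenter_averaging functions. The function name self_train_average, the n_to_label parameter, and the simple count-based stopping criterion are illustrative assumptions, not the paper's actual implementation (the paper pairs ST with dedicated stopping criteria); equal-length series are assumed, as in most UCR archive datasets.

import numpy as np
from tslearn.metrics import dtw
from tslearn.barycenters import dtw_barycenter_averaging

def self_train_average(pl, u, n_to_label):
    # pl: list of 1-D numpy arrays, the initial positive labeled set PL
    # u:  list of 1-D numpy arrays, the unlabeled set U
    # n_to_label: hypothetical count-based stopping rule (illustrative only)
    pl, u = list(pl), list(u)
    for _ in range(min(n_to_label, len(u))):
        # Average sequence of the current PL set via DTW barycenter averaging.
        avg = dtw_barycenter_averaging(pl).ravel()
        # Label the unlabeled series closest to the average sequence,
        # rather than closest to any single PL member (the 1NN rule that
        # conventional ST uses and ST-average replaces).
        i = min(range(len(u)), key=lambda j: dtw(avg, u[j]))
        pl.append(u.pop(i))
    return pl, u

# Example: three labeled positives, five unlabeled series, label two of them.
rng = np.random.default_rng(0)
pl0 = [np.sin(np.linspace(0, 6, 60)) + 0.1 * rng.standard_normal(60) for _ in range(3)]
u0 = [rng.standard_normal(60) for _ in range(5)]
pl1, u1 = self_train_average(pl0, u0, n_to_label=2)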

Bibliographic Details
Main Authors: Li, Jing, Zhang, Haowen, Dong, Yabo, Zuo, Tongbin, Xu, Duanqing
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8587877/
https://www.ncbi.nlm.nih.gov/pubmed/34770721
http://dx.doi.org/10.3390/s21217414
_version_ 1784598281460908032
author Li, Jing
Zhang, Haowen
Dong, Yabo
Zuo, Tongbin
Xu, Duanqing
author_facet Li, Jing
Zhang, Haowen
Dong, Yabo
Zuo, Tongbin
Xu, Duanqing
author_sort Li, Jing
collection PubMed
description Traditional supervised time series classification (TSC) tasks assume that all training data are labeled. In practice, however, manually labeling all unlabeled data can be very time-consuming and often requires the participation of skilled domain experts. In this paper, we are concerned with the positive unlabeled time series classification (PUTSC) problem, which refers to automatically labeling a large unlabeled set U based on a small positive labeled set PL. Self-training (ST) is the most widely used method for solving the PUTSC problem and has attracted increasing attention due to its simplicity and effectiveness. Existing ST methods simply employ the one-nearest-neighbor (1NN) rule to determine which unlabeled time series should be labeled. We note, however, that the 1NN rule might not be optimal for PUTSC tasks, because it can be sensitive to initial labeled data located near the boundary between the positive and negative classes. To overcome this issue, we propose an exploratory methodology called ST-average. Unlike conventional ST-based approaches, ST-average uses the average sequence computed by the DTW barycenter averaging (DBA) technique to label the data. Compared with any individual series in the PL set, the average sequence is more representative. Our proposal is insensitive to the initial labeled data and is more reliable than existing ST-based methods. In addition, we demonstrate that ST-average can naturally be combined with many existing techniques used in the original ST. Experimental results on public datasets show that ST-average performs better than related popular methods.
format Online
Article
Text
id pubmed-8587877
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8587877 2021-11-13 An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging Li, Jing Zhang, Haowen Dong, Yabo Zuo, Tongbin Xu, Duanqing Sensors (Basel) Article Traditional supervised time series classification (TSC) tasks assume that all training data are labeled. In practice, however, manually labeling all unlabeled data can be very time-consuming and often requires the participation of skilled domain experts. In this paper, we are concerned with the positive unlabeled time series classification (PUTSC) problem, which refers to automatically labeling a large unlabeled set U based on a small positive labeled set PL. Self-training (ST) is the most widely used method for solving the PUTSC problem and has attracted increasing attention due to its simplicity and effectiveness. Existing ST methods simply employ the one-nearest-neighbor (1NN) rule to determine which unlabeled time series should be labeled. We note, however, that the 1NN rule might not be optimal for PUTSC tasks, because it can be sensitive to initial labeled data located near the boundary between the positive and negative classes. To overcome this issue, we propose an exploratory methodology called ST-average. Unlike conventional ST-based approaches, ST-average uses the average sequence computed by the DTW barycenter averaging (DBA) technique to label the data. Compared with any individual series in the PL set, the average sequence is more representative. Our proposal is insensitive to the initial labeled data and is more reliable than existing ST-based methods. In addition, we demonstrate that ST-average can naturally be combined with many existing techniques used in the original ST. Experimental results on public datasets show that ST-average performs better than related popular methods. MDPI 2021-11-08 /pmc/articles/PMC8587877/ /pubmed/34770721 http://dx.doi.org/10.3390/s21217414 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Li, Jing
Zhang, Haowen
Dong, Yabo
Zuo, Tongbin
Xu, Duanqing
An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging
title An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging
title_full An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging
title_fullStr An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging
title_full_unstemmed An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging
title_short An Improved Self-Training Method for Positive Unlabeled Time Series Classification Using DTW Barycenter Averaging
title_sort improved self-training method for positive unlabeled time series classification using dtw barycenter averaging
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8587877/
https://www.ncbi.nlm.nih.gov/pubmed/34770721
http://dx.doi.org/10.3390/s21217414
work_keys_str_mv AT lijing animprovedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT zhanghaowen animprovedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT dongyabo animprovedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT zuotongbin animprovedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT xuduanqing animprovedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT lijing improvedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT zhanghaowen improvedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT dongyabo improvedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT zuotongbin improvedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging
AT xuduanqing improvedselftrainingmethodforpositiveunlabeledtimeseriesclassificationusingdtwbarycenteraveraging