Area under Precision-Recall Curves for Weighted and Unweighted Data

Precision-recall curves are highly informative about the performance of binary classifiers, and the area under these curves is a popular scalar performance measure for comparing different classifiers. However, for many applications class labels are not provided with absolute certainty, but with some...

Bibliographic Details
Main authors: Keilwagen, Jens; Grosse, Ivo; Grau, Jan
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3961324/
https://www.ncbi.nlm.nih.gov/pubmed/24651729
http://dx.doi.org/10.1371/journal.pone.0092209
_version_ 1782308278285369344
author Keilwagen, Jens
Grosse, Ivo
Grau, Jan
collection PubMed
description Precision-recall curves are highly informative about the performance of binary classifiers, and the area under these curves is a popular scalar performance measure for comparing different classifiers. However, for many applications class labels are not provided with absolute certainty, but with some degree of confidence, often reflected by weights or soft labels assigned to data points. Computing the area under the precision-recall curve requires interpolating between adjacent supporting points, but previous interpolation schemes are not directly applicable to weighted data. Hence, even in cases where weights were available, they had to be neglected for assessing classifiers using precision-recall curves. Here, we propose an interpolation for precision-recall curves that can also be used for weighted data, and we derive conditions for classification scores yielding the maximum and minimum area under the precision-recall curve. We investigate accordances and differences of the proposed interpolation and previous ones, and we demonstrate that taking into account existing weights of test data is important for the comparison of classifiers.
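As a rough illustration of what evaluating a classifier on weighted data can look like, the sketch below computes precision-recall points from soft labels and sums a simple step-wise area under them. It is not the interpolation proposed in the article, which this record does not spell out. The function names `weighted_pr_points` and `step_area` are hypothetical, and the code assumes each weight lies in [0, 1] and is the confidence that the data point is positive, so a point with weight w counts as w of a positive and 1 - w of a negative.

```python
def weighted_pr_points(scores, weights):
    """Return (recall, precision) pairs, one per distinct score threshold.

    Assumption (not from the article): weights[i] in [0, 1] is the
    confidence that item i is positive; 1 - weights[i] counts as negative.
    """
    total_pos = sum(weights)
    if total_pos <= 0:
        raise ValueError("need a positive total weight for the positive class")

    # Visit items from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0.0
    points = []
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Items with the same score enter the prediction set together.
        while i < len(order) and scores[order[i]] == threshold:
            w = weights[order[i]]
            tp += w          # weighted true positives
            fp += 1.0 - w    # weighted false positives
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def step_area(points):
    """Step-wise area: precision times the recall gained at each threshold.

    This is a plain summation, not an interpolated area.
    """
    area, prev_recall = 0.0, 0.0
    for recall, precision in points:
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6]
    hard = [1, 0, 1, 0]            # certain labels
    soft = [0.9, 0.2, 0.8, 0.1]    # labels given with some confidence
    print(step_area(weighted_pr_points(scores, hard)))
    print(step_area(weighted_pr_points(scores, soft)))
```

With hard 0/1 weights this reduces to the usual unweighted counts, so the two calls in the usage example show how the same ranking can score differently once label confidence is taken into account.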
format Online
Article
Text
id pubmed-3961324
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3961324 2014-03-27
PLoS One, Research Article
Public Library of Science 2014-03-20
/pmc/articles/PMC3961324/ /pubmed/24651729 http://dx.doi.org/10.1371/journal.pone.0092209
Text en
© 2014 Keilwagen et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Area under Precision-Recall Curves for Weighted and Unweighted Data
topic Research Article