Discordant pair analysis for sample efficient model evaluation

We present a new technique for assessing the effectiveness of a classification algorithm using discordant pair analysis. This method utilizes a known performance baseline algorithm and a large unlabeled dataset with an assumed class distribution to obtain overall performance estimates by only assess...

Bibliographic Details
Main Authors: Musgrove, Donald; Radtke, Andrew; Haddad, Tarek
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10684559/
https://www.ncbi.nlm.nih.gov/pubmed/38017010
http://dx.doi.org/10.1038/s41598-023-48017-4
author Musgrove, Donald
Radtke, Andrew
Haddad, Tarek
collection PubMed
description We present a new technique for assessing the effectiveness of a classification algorithm using discordant pair analysis. This method utilizes a known performance baseline algorithm and a large unlabeled dataset with an assumed class distribution to obtain overall performance estimates by only assessing the subset of examples that the algorithms classify discordantly. Our approach offers an efficient way to evaluate the performance of an algorithm that minimizes the human adjudications needed while also maintaining precision in the evaluation and in some cases improving the evaluation quality by reducing human adjudication errors. This approach is a computationally efficient alternative to the traditional exhaustive method of performance evaluation and has the potential to improve the accuracy of performance estimates. Simulation studies show that the discordant pair method reduces the number of adjudications by over 90%, while maintaining the same level of sensitivity and specificity.
format Online
Article
Text
id pubmed-10684559
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10684559 2023-11-30 Discordant pair analysis for sample efficient model evaluation Musgrove, Donald; Radtke, Andrew; Haddad, Tarek Sci Rep Article Nature Publishing Group UK 2023-11-28 /pmc/articles/PMC10684559/ /pubmed/38017010 http://dx.doi.org/10.1038/s41598-023-48017-4 Text en
© The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Discordant pair analysis for sample efficient model evaluation
topic Article
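
The idea in the abstract can be illustrated with a small arithmetic sketch. It is not the estimator from the article, only a counting identity under stated assumptions: the baseline's sensitivity and specificity and the class prevalence are treated as exact, and every discordant example is adjudicated. Wherever the two algorithms agree they are right or wrong together, so only the discordant examples can move the new algorithm's counts away from the baseline's. All names below (DiscordantCounts, estimate_new_performance) and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiscordantCounts:
    """Adjudicated discordant examples, split by true class and by which algorithm was right."""
    pos_new_right: int   # truly positive: new says positive, baseline says negative
    pos_base_right: int  # truly positive: baseline says positive, new says negative
    neg_new_right: int   # truly negative: new says negative, baseline says positive
    neg_base_right: int  # truly negative: baseline says negative, new says positive


def estimate_new_performance(n_total, prevalence, base_sens, base_spec, counts):
    """Return (sensitivity, specificity) of the new algorithm.

    Assumes prevalence, base_sens and base_spec are exact (illustrative assumption).
    """
    n_pos = prevalence * n_total
    n_neg = n_total - n_pos
    if n_pos <= 0 or n_neg <= 0:
        raise ValueError("both classes must be present")
    # Net true positives gained over the baseline, as a fraction of all positives.
    sens = base_sens + (counts.pos_new_right - counts.pos_base_right) / n_pos
    # Net true negatives gained over the baseline, as a fraction of all negatives.
    spec = base_spec + (counts.neg_new_right - counts.neg_base_right) / n_neg
    return sens, spec


# Made-up numbers: 10,000 unlabeled examples, of which only the 200 discordant
# ones (2%) are sent for human adjudication.
counts = DiscordantCounts(pos_new_right=40, pos_base_right=10,
                          neg_new_right=120, neg_base_right=30)
sens, spec = estimate_new_performance(10_000, 0.10, 0.90, 0.95, counts)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this made-up case 9,800 of the 10,000 examples need no human label. How accurate the resulting estimates are depends entirely on how well the baseline performance and the class distribution are known.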