Classifiers and their Metrics Quantified

Molecular modeling frequently constructs classification models for the prediction of two‐class entities, such as compound bio(in)activity, chemical property (non)existence, protein (non)interaction, and so forth. The models are evaluated using well known metrics such as accuracy or true positive rat...

Bibliographic Details
Main author:	Brown, J. B.
Format:	Online Article Text
Language:	English
Published:	John Wiley and Sons Inc. 2018
Subjects:	Methods Corner
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5838539/ https://www.ncbi.nlm.nih.gov/pubmed/29360259 http://dx.doi.org/10.1002/minf.201700127

_version_	1783304280496668672
author	Brown, J. B.
author_facet	Brown, J. B.
author_sort	Brown, J. B.
collection	PubMed
description	Molecular modeling frequently constructs classification models for the prediction of two‐class entities, such as compound bio(in)activity, chemical property (non)existence, protein (non)interaction, and so forth. The models are evaluated using well known metrics such as accuracy or true positive rates. However, these frequently used metrics applied to retrospective and/or artificially generated prediction datasets can potentially overestimate true performance in actual prospective experiments. Here, we systematically consider metric value surface generation as a consequence of data balance, and propose the computation of an inverse cumulative distribution function taken over a metric surface. The proposed distribution analysis can aid in the selection of metrics when formulating study design. In addition to theoretical analyses, a practical example in chemogenomic virtual screening highlights the care required in metric selection and interpretation.
format	Online Article Text
id	pubmed-5838539
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	John Wiley and Sons Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-58385392018-03-12 Classifiers and their Metrics Quantified Brown, J. B. Mol Inform Methods Corner Molecular modeling frequently constructs classification models for the prediction of two‐class entities, such as compound bio(in)activity, chemical property (non)existence, protein (non)interaction, and so forth. The models are evaluated using well known metrics such as accuracy or true positive rates. However, these frequently used metrics applied to retrospective and/or artificially generated prediction datasets can potentially overestimate true performance in actual prospective experiments. Here, we systematically consider metric value surface generation as a consequence of data balance, and propose the computation of an inverse cumulative distribution function taken over a metric surface. The proposed distribution analysis can aid in the selection of metrics when formulating study design. In addition to theoretical analyses, a practical example in chemogenomic virtual screening highlights the care required in metric selection and interpretation. John Wiley and Sons Inc. 2018-01-23 2018-01 /pmc/articles/PMC5838539/ /pubmed/29360259 http://dx.doi.org/10.1002/minf.201700127 Text en © 2018 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial (http://creativecommons.org/licenses/by-nc/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
spellingShingle	Methods Corner Brown, J. B. Classifiers and their Metrics Quantified
title	Classifiers and their Metrics Quantified
title_full	Classifiers and their Metrics Quantified
title_fullStr	Classifiers and their Metrics Quantified
title_full_unstemmed	Classifiers and their Metrics Quantified
title_short	Classifiers and their Metrics Quantified
title_sort	classifiers and their metrics quantified
topic	Methods Corner
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5838539/ https://www.ncbi.nlm.nih.gov/pubmed/29360259 http://dx.doi.org/10.1002/minf.201700127
work_keys_str_mv	AT brownjb classifiersandtheirmetricsquantified
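
The description above mentions generating a metric value surface as a consequence of data balance and taking an inverse cumulative distribution function over it. The following is a minimal sketch of one way to read that idea, not the author's implementation. It assumes the surface is indexed by true positive rate and true negative rate, uses accuracy as the metric, and interprets the inverse cumulative distribution as the fraction of surface points at or above a threshold. The function names `accuracy_surface` and `inverse_cdf` and the grid resolution are hypothetical; this record does not give the paper's exact definitions.

```python
import numpy as np


def accuracy_surface(pos_fraction, steps=101):
    """Accuracy at every (TPR, TNR) grid point for a given positive-class fraction.

    Assumption: the surface axes are TPR and TNR, each sampled on [0, 1].
    """
    rates = np.linspace(0.0, 1.0, steps)
    tpr, tnr = np.meshgrid(rates, rates, indexing="ij")
    # Accuracy = P(positive) * TPR + P(negative) * TNR
    return pos_fraction * tpr + (1.0 - pos_fraction) * tnr


def inverse_cdf(surface, thresholds):
    """Fraction of surface points whose metric value is >= each threshold.

    Assumption: this "at or above" fraction is what the inverse cumulative
    distribution over the surface refers to.
    """
    values = surface.ravel()
    return np.array([(values >= t).mean() for t in thresholds])


thresholds = np.linspace(0.0, 1.0, 11)
balanced = inverse_cdf(accuracy_surface(0.5), thresholds)    # 1:1 class ratio
imbalanced = inverse_cdf(accuracy_surface(0.1), thresholds)  # 1:9 class ratio
```

Comparing `balanced` and `imbalanced` at a high threshold shows how class ratio changes the share of (TPR, TNR) combinations that reach a given accuracy. For example, with 10% positives a classifier that labels everything negative (TPR = 0, TNR = 1) still scores 0.9 accuracy.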
