Quantifying Overfitting Potential in Drug Binding Datasets

In this paper, we investigate potential biases in datasets used to make drug binding predictions using machine learning. We investigate a recently published metric called the Asymmetric Validation Embedding (AVE) bias which is used to quantify this bias and detect overfitting. We compare it to a sli...

Bibliographic Details
Main authors:	Davis, Brian; Mcloughlin, Kevin; Allen, Jonathan; Ellingson, Sally R.
Format:	Online Article Text
Language:	English
Published:	2020
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7304006/ http://dx.doi.org/10.1007/978-3-030-50420-5_44

author	Davis, Brian; Mcloughlin, Kevin; Allen, Jonathan; Ellingson, Sally R.
collection	PubMed
description	In this paper, we investigate potential biases in datasets used to make drug binding predictions using machine learning. We investigate a recently published metric called the Asymmetric Validation Embedding (AVE) bias, which is used to quantify this bias and detect overfitting. We compare it to a slightly revised version and introduce a new weighted metric. We find that the new metrics make it possible to quantify overfitting without overly limiting training data, and that they produce models with greater predictive value.
format	Online Article Text
id	pubmed-7304006
institution	National Center for Biotechnology Information
language	English
publishDate	2020
record_format	MEDLINE/PubMed
spelling	pubmed-7304006 2020-06-19 Quantifying Overfitting Potential in Drug Binding Datasets Davis, Brian; Mcloughlin, Kevin; Allen, Jonathan; Ellingson, Sally R. Computational Science – ICCS 2020 Article In this paper, we investigate potential biases in datasets used to make drug binding predictions using machine learning. We investigate a recently published metric called the Asymmetric Validation Embedding (AVE) bias, which is used to quantify this bias and detect overfitting. We compare it to a slightly revised version and introduce a new weighted metric. We find that the new metrics make it possible to quantify overfitting without overly limiting training data, and that they produce models with greater predictive value. 2020-05-22 /pmc/articles/PMC7304006/ http://dx.doi.org/10.1007/978-3-030-50420-5_44 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title	Quantifying Overfitting Potential in Drug Binding Datasets
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7304006/ http://dx.doi.org/10.1007/978-3-030-50420-5_44

Similar Items