Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach

The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are...

Bibliographic Details
Main authors: Nagelkerke, Nico; Fidler, Vaclav
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4608588/
https://www.ncbi.nlm.nih.gov/pubmed/26474313
http://dx.doi.org/10.1371/journal.pone.0140718
collection PubMed
description The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations.
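The model summarized in the description can be illustrated with a minimal sketch. It is not the authors' implementation, and the record does not give their parameterization. The sketch assumes a single covariate `x`, a true-case probability `expit(alpha + beta * x)`, and a hypothetical parameter `pi` for the probability that a true case is correctly labeled as a case, with true controls never mislabeled. Under these assumptions the probability of carrying a case label is the "defective" logistic curve `pi * expit(alpha + beta * x)`, which levels off at `pi` rather than 1. All function and parameter names are illustrative.

```python
import math


def expit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def prob_labeled_case(x, alpha, beta, pi):
    """Assumed defective logistic curve: P(labeled as case | x)."""
    return pi * expit(alpha + beta * x)


def log_likelihood(xs, labels, alpha, beta, pi):
    """Binomial log-likelihood of the observed labels (1 = case, 0 = control).

    Maximizing this over (alpha, beta, pi) with any general-purpose
    optimizer gives maximum likelihood estimates under the assumed model.
    """
    total = 0.0
    for x, y in zip(xs, labels):
        q = prob_labeled_case(x, alpha, beta, pi)
        total += math.log(q) if y == 1 else math.log(1.0 - q)
    return total


def prob_true_case_given_control_label(x, alpha, beta, pi):
    """Bayes' rule: P(truly a case | labeled as control, x)."""
    p = expit(alpha + beta * x)
    return (1.0 - pi) * p / (1.0 - pi * p)


def expected_mislabeled(xs, labels, alpha, beta, pi):
    """Expected number of true cases among the observations labeled control."""
    return sum(
        prob_true_case_given_control_label(x, alpha, beta, pi)
        for x, y in zip(xs, labels)
        if y == 0
    )
```

With `pi = 1` the model reduces to ordinary logistic regression and the expected number of mislabeled controls is zero, which is a convenient sanity check for any fitted values.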
format Online
Article
Text
id pubmed-4608588
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4608588 2015-10-29 Nagelkerke, Nico; Fidler, Vaclav. PLoS One. Research Article. Public Library of Science 2015-10-16 /pmc/articles/PMC4608588/ /pubmed/26474313 http://dx.doi.org/10.1371/journal.pone.0140718 Text en © 2015 Nagelkerke, Fidler http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.