
A Confidence Interval for the Wallace Coefficient of Concordance and Its Application to Microbial Typing Methods

Very diverse research fields frequently deal with the analysis of multiple clustering results, which should imply an objective detection of overlaps and divergences between the formed groupings. The congruence between these multiple results can be quantified by clustering comparison measures such as...


Bibliographic Details
Main authors: Pinto, Francisco R., Melo-Cristino, José, Ramirez, Mário
Format: Text
Language: English
Published: Public Library of Science 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2577298/
https://www.ncbi.nlm.nih.gov/pubmed/19002246
http://dx.doi.org/10.1371/journal.pone.0003696
author Pinto, Francisco R.
Melo-Cristino, José
Ramirez, Mário
collection PubMed
description Very diverse research fields frequently deal with the analysis of multiple clustering results, which should imply an objective detection of overlaps and divergences between the formed groupings. The congruence between these multiple results can be quantified by clustering comparison measures such as the Wallace coefficient (W). Because the measured congruence depends on the particular sample taken from the population, the estimated values vary relative to the true population values. In the present work we propose the use of a confidence interval (CI) to account for this variability when W is used. The analytical formula for the CI is derived by assuming a Gaussian sampling distribution and by using the algebraic relationship between W and Simpson's index of diversity. This relationship also allows the estimation of the expected Wallace value under the assumption of independence of classifications. We evaluated the CI performance using simulated and published microbial typing data sets. The simulations showed that the CI has the desired 95% coverage when W is greater than 0.5. This behaviour is robust to changes in cluster number, cluster size distributions and sample size. The analysis of the published data sets demonstrated the usefulness of the new CI by objectively validating some of the previous interpretations, while showing that other conclusions lacked statistical support.
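As a rough illustration of the quantities named in the description, the sketch below computes the Wallace coefficient, Simpson's index of diversity, and the expected Wallace value under independent classifications. It assumes the usual pair-counting definition of W (the fraction of item pairs grouped together by method A that are also grouped together by method B) and the without-replacement form of Simpson's index; the function names and the toy labels are hypothetical, and the analytical CI itself is not reproduced here because the record does not give its formula.

```python
from collections import Counter


def pairs_together(labels):
    """Count unordered pairs of items that share the same label."""
    return sum(n * (n - 1) // 2 for n in Counter(labels).values())


def simpson_diversity(labels):
    """Simpson's index of diversity: probability that two items drawn
    without replacement fall in different clusters."""
    n = len(labels)
    if n < 2:
        raise ValueError("need at least two items")
    total_pairs = n * (n - 1) // 2
    return 1 - pairs_together(labels) / total_pairs


def wallace(a, b):
    """Wallace coefficient W(A -> B): of the pairs grouped together by A,
    the fraction that B also groups together."""
    if len(a) != len(b):
        raise ValueError("both classifications must cover the same items")
    together_a = pairs_together(a)
    if together_a == 0:
        raise ValueError("A has no pair of items in the same cluster")
    # A pair is together in both A and B exactly when it shares the (a, b) label pair.
    together_both = pairs_together(list(zip(a, b)))
    return together_both / together_a


def expected_wallace(b):
    """Expected W(A -> B) if A and B were independent: the chance that a
    random pair is together in B, i.e. 1 minus Simpson's index of B."""
    return 1 - simpson_diversity(b)


# Toy data (hypothetical): two typing methods applied to six isolates.
method_a = [1, 1, 1, 2, 2, 3]
method_b = ["x", "x", "y", "y", "y", "z"]

print(wallace(method_a, method_b))   # 2 of the 4 pairs grouped by A are also grouped by B
print(expected_wallace(method_b))    # 4 of the 15 possible pairs are grouped by B
```

Comparing the observed W with the expected value under independence gives a first indication of whether the agreement between two typing methods exceeds what chance alone would produce; the CI proposed in the paper is what makes that comparison statistically grounded.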
format Text
id pubmed-2577298
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2577298 2008-11-11 A Confidence Interval for the Wallace Coefficient of Concordance and Its Application to Microbial Typing Methods Pinto, Francisco R. Melo-Cristino, José Ramirez, Mário PLoS One Research Article Public Library of Science 2008-11-11 /pmc/articles/PMC2577298/ /pubmed/19002246 http://dx.doi.org/10.1371/journal.pone.0003696 Text en Pinto et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Confidence Interval for the Wallace Coefficient of Concordance and Its Application to Microbial Typing Methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2577298/
https://www.ncbi.nlm.nih.gov/pubmed/19002246
http://dx.doi.org/10.1371/journal.pone.0003696