
Efficient algorithms for calculating the probability distribution of the sum of hypergeometric-distributed random variables

In probability theory and statistics, the probability distribution of the sum of two or more independent and identically distributed (i.i.d.) random variables is the convolution of their individual distributions. While convoluting random variables following a binomial, geometric or Poisson distribut...
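The convolution statement above can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used by the authors and is not their code; the function names (`hypergeom_pmf`, `convolve`, `sum_pmf`) and the parameter values (N = 50, M = 10, n = 5, k = 4) are assumptions chosen only for illustration. It computes the p.m.f. of the sum of k i.i.d. hypergeometric random variables by repeated direct convolution.

```python
from math import comb


def hypergeom_pmf(N, M, n):
    """p.m.f. of the number of marked items in a sample of size n drawn
    without replacement from N items, M of which are marked.
    Returns a list indexed by x = 0..min(n, M)."""
    lo, hi = max(0, n - (N - M)), min(n, M)
    total = comb(N, n)
    pmf = [0.0] * (hi + 1)
    for x in range(lo, hi + 1):
        pmf[x] = comb(M, x) * comb(N - M, n - x) / total
    return pmf


def convolve(p, q):
    """Discrete convolution of two p.m.f.s given as lists over 0, 1, 2, ..."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def sum_pmf(pmf, k):
    """p.m.f. of the sum of k i.i.d. variables, each with the given p.m.f."""
    result = [1.0]  # p.m.f. of the constant 0 (identity for convolution)
    for _ in range(k):
        result = convolve(result, pmf)
    return result


# Illustrative (assumed) parameters: four samples of size 5 from a lot of 50
# items containing 10 marked items.
single = hypergeom_pmf(N=50, M=10, n=5)
pmf_sum = sum_pmf(single, k=4)

# The c.d.f. follows by cumulative summation of the p.m.f.
cdf_sum = []
running = 0.0
for p in pmf_sum:
    running += p
    cdf_sum.append(running)
```

Direct convolution of this kind costs time roughly quadratic in the support size for each added variable, which is the sort of effort that motivates recursive or approximate alternatives.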


Bibliographic Details
Main Authors:	Johannssen, Arne; Chukhrova, Nataliya; Castagliola, Philippe
Format:	Online Article Text
Language:	English
Published:	Elsevier 2021
Subjects:	Method Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8563477/ https://www.ncbi.nlm.nih.gov/pubmed/34754778 http://dx.doi.org/10.1016/j.mex.2021.101507

collection	PubMed
description	In probability theory and statistics, the probability distribution of the sum of two or more independent and identically distributed (i.i.d.) random variables is the convolution of their individual distributions. While convoluting random variables following a binomial, geometric or Poisson distribution is a straightforward procedure, convoluting hypergeometric-distributed random variables is not. The problem is that there is no closed-form solution for the probability mass function (p.m.f.) and cumulative distribution function (c.d.f.) of the sum of i.i.d. hypergeometric random variables. To overcome this problem, we propose an approximation for the distribution of the sum of i.i.d. hypergeometric random variables. In addition, we compare this approximation with two classical numerical methods, i.e., convolution and the recursive algorithm by De Pril, by means of an application in Statistical Process Monitoring (SPM). We provide MATLAB codes to implement these three methods for computing the probability distribution of the sum of i.i.d. hypergeometric random variables in an efficient way. The obtained results show that the proposed approximation has remarkable properties and may be helpful in all fields where the problem of convoluting hypergeometric-distributed random variables occurs. Therefore, the approximation considered in this paper is well suited to make a change over established practices.
• This article presents theoretical bases of three methods for determining the probability distribution of the sum of i.i.d. hypergeometric random variables: (1) direct convolution, (2) recursive algorithm by De Pril, (3) approximation.
• We provide associated MATLAB codes (including context-specific customizations) for direct implementation of these methods and discuss technical aspects and essential details of the tweaks we have made.
• A representative application example in SPM shows that the proposed approximation is considerably simpler in application than both other methods and that it ensures a remarkably high accuracy of the results while reducing computational time considerably.
id	pubmed-8563477
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-8563477 2021-11-08 MethodsX Method Article Elsevier 2021-09-06 /pmc/articles/PMC8563477/ /pubmed/34754778 http://dx.doi.org/10.1016/j.mex.2021.101507 Text en © 2021 The Author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
