Van Houtum distribution

Parameters	$p_a,p_b \in [0,1] \text{ and } a,b \in \mathbb{Z} \text{ with } a\leq b$
Support	$u \in \{a,a+1,\dots,b-1,b\}\,$
pmf	$\begin{cases} p_a & \text{if } u=a \\ p_b & \text{if } u=b \\ \frac{1-p_a-p_b}{b-a-1} & \text{if } a<u<b \\ 0 & \text{otherwise} \end{cases}$
CDF	$\begin{cases} 0 & \text{if } u<a \\ p_a & \text{if } u=a \\ p_a+\lfloor u-a \rfloor \frac{1-p_a-p_b}{b-a-1} & \text{if } a<u<b \\ 1 & \text{if } u \geq b \end{cases}$
Mean	$ap_a+bp_b+(1-p_a-p_b)\frac{a+b}{2}$
Mode	N/A
Variance	$a^2p_a+b^2p_b+\frac{1-p_a-p_b}{b-a-1}\cdot\frac{b(2b-1)(b-1)-a(2a+1)(a+1)}{6}-\left(\frac{(a+b)(1-p_a-p_b)+2ap_a+2bp_b}{2}\right)^2$
Entropy	$-p_a \ln(p_a)-p_b\ln(p_b)-(1-p_a-p_b)\ln\left(\frac{1-p_a-p_b}{b-a-1}\right)$
MGF	$e^{ta}p_a+e^{tb}p_b+\frac{1-p_a-p_b}{b-a-1}\frac{e^{bt}-e^{(a+1)t}}{e^t-1}$
CF	$e^{ita}p_a+e^{itb}p_b+\frac{1-p_a-p_b}{b-a-1}\frac{e^{bit}-e^{(a+1)it}}{e^{it}-1}$

In probability theory and statistics, the Van Houtum distribution is a discrete probability distribution named after Prof. Geert-Jan van Houtum.^[1] It is characterized by assigning equal probability to every value in a finite set of consecutive integers, except possibly the smallest and largest elements of this set. Because it is uniform except possibly at its boundaries, the Van Houtum distribution generalizes the discrete uniform distribution and is sometimes also referred to as quasi-uniform.

Often the only information available about a discrete random variable is its first two moments. The Van Houtum distribution can be used to fit a distribution with finite support to these moments.

A simple example of the Van Houtum distribution arises when throwing a loaded die that has been tampered with so that it lands on a 6 twice as often as on a 1. The sample space consists of the values 1, 2, 3, 4, 5 and 6. Each time the die is thrown, the probability of throwing each of 2, 3, 4 and 5 is 1/6; the probability of a 1 is 1/9 and the probability of a 6 is 2/9. This corresponds to the parameters a = 1, b = 6, p_a = 1/9 and p_b = 2/9.

Probability mass function

A random variable U has a Van Houtum (a, b, p_a, p_b) distribution if its probability mass function is

\Pr(U=u) = \begin{cases} p_a & \text{if } u=a \\[8pt] p_b & \text{if } u=b \\[8pt] \dfrac{1-p_a-p_b}{b-a-1} & \text{if } a<u<b \\[8pt] 0 & \text{otherwise} \end{cases}
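
The case distinction above translates almost directly into code. The following is a minimal Python sketch, not a library routine: the function name `van_houtum_pmf` is hypothetical, and the code assumes integer arguments `u`, `a`, `b` and probabilities passed as `Fraction` objects so that the arithmetic stays exact. It is checked against the loaded-die example.

```python
from fractions import Fraction


def van_houtum_pmf(u, a, b, p_a, p_b):
    """Return Pr(U = u) for a Van Houtum(a, b, p_a, p_b) variable.

    Assumption of this sketch: u, a, b are integers with a <= b,
    and p_a, p_b are Fractions with p_a + p_b <= 1.
    """
    if u == a:
        return p_a
    if u == b:
        return p_b
    if a < u < b:
        # The remaining mass is spread evenly over the b - a - 1 interior points.
        return (1 - p_a - p_b) / (b - a - 1)
    return Fraction(0)


# Loaded die: a = 1, b = 6, p_a = 1/9, p_b = 2/9.
die = [van_houtum_pmf(u, 1, 6, Fraction(1, 9), Fraction(2, 9)) for u in range(1, 7)]
```

Here `die` holds the six face probabilities 1/9, 1/6, 1/6, 1/6, 1/6 and 2/9, which sum to 1.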

Fitting procedure

Suppose a random variable $X$ has mean $\mu$ and squared coefficient of variation $c^2$. Let $U$ be a Van Houtum distributed random variable. Then the first two moments of $U$ match the first two moments of $X$ if $a$, $b$, $p_a$ and $p_b$ are chosen such that:^[2]

\begin{align} a &= \left\lceil \mu - \frac{1}{2} \left\lceil \sqrt{1+12c^2\mu^2} \right\rceil \right\rceil \\[8pt] b &= \left\lfloor \mu + \frac{1}{2} \left\lceil \sqrt{1+12c^2\mu^2} \right\rceil \right\rfloor \\[8pt] p_b &= \frac{(c^2+1)\mu^2-A-(a^2-A)(2\mu-a-b)/(a-b)}{a^2+b^2-2A} \\[8pt] p_a &= \frac{2\mu-a-b}{a-b}+p_b \\[12pt] \text{where } A & = \frac{2a^2+a+2ab-b+2b^2}{6}. \end{align}
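
The formulas above can be evaluated step by step. The sketch below is one possible Python transcription under stated assumptions: the function name `fit_van_houtum` is hypothetical, floating-point arithmetic is used, and the degenerate case a = b (which makes the denominators vanish) is rejected explicitly rather than handled.

```python
import math


def fit_van_houtum(mu, c2):
    """Return (a, b, p_a, p_b) matching mean mu and squared
    coefficient of variation c2, following the formulas above.

    Sketch only: the result is not checked for feasibility, so
    p_a or p_b may fall outside [0, 1] for unsuitable inputs.
    """
    half_width = 0.5 * math.ceil(math.sqrt(1 + 12 * c2 * mu**2))
    a = math.ceil(mu - half_width)
    b = math.floor(mu + half_width)
    if a == b:
        raise ValueError("degenerate case a == b is not handled in this sketch")

    # A is the mean of u^2 over the interior points a+1, ..., b-1.
    A = (2 * a * a + a + 2 * a * b - b + 2 * b * b) / 6
    d = (2 * mu - a - b) / (a - b)  # equals p_a - p_b
    p_b = ((c2 + 1) * mu**2 - A - (a * a - A) * d) / (a * a + b * b - 2 * A)
    p_a = d + p_b
    return a, b, p_a, p_b


# Round trip on the loaded die: mean 34/9 and variance 230/81.
mu = 34 / 9
c2 = (230 / 81) / mu**2
a, b, p_a, p_b = fit_van_houtum(mu, c2)
```

For the loaded die, the fit recovers a = 1 and b = 6, with p_a and p_b equal to 1/9 and 2/9 up to floating-point rounding.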

There does not exist a Van Houtum distribution for every combination of $\mu$ and $c^2$. Since for any real mean $\mu$ the distribution on the integers with minimal variance is concentrated on the integers $\lfloor \mu \rfloor$ and $\lceil \mu \rceil$, it is easy to verify that a Van Houtum distribution (or indeed any discrete distribution on the integers) can only be fitted on the first two moments if^[3]

c^2\mu^2 \geq (\mu-\lfloor \mu \rfloor)(1+\lfloor \mu \rfloor-\mu)^2+(\mu-\lfloor \mu \rfloor)^2(1+\lfloor \mu \rfloor-\mu).
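
A feasibility check can be sketched from the two-point argument above: with f the fractional part of the mean, the minimal-variance distribution puts mass f one unit above the floor of the mean and mass 1 - f on the floor itself, so its variance is f(1 - f)^2 + f^2(1 - f). The Python below is an illustration with hypothetical function names, using floating-point arithmetic.

```python
import math


def min_integer_variance(mu):
    """Smallest variance of any integer-valued distribution with mean mu.

    It is attained by the two-point distribution with mass f one unit
    above floor(mu) and mass 1 - f on floor(mu), where f = mu - floor(mu).
    """
    f = mu - math.floor(mu)
    return f * (1 - f) ** 2 + f**2 * (1 - f)


def can_fit(mu, c2):
    """True if the variance c2 * mu^2 is at least the minimal variance."""
    return c2 * mu**2 >= min_integer_variance(mu)
```

For instance, a mean of 0.1 requires a variance of at least about 0.09, so `can_fit(0.1, 0.5)` is false because the requested variance is only 0.005.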

References

[1] A. Saura (2012), Van Houtumin jakauma (in Finnish). BSc Thesis, University of Helsinki, Finland.
[2] J.J. Arts (2009), Efficient optimization of the Dual-Index policy using Markov Chain approximations. MSc Thesis, Eindhoven University of Technology, The Netherlands (Appendix B).
[3] I.J.B.F. Adan, M.J.A. van Eenige, and J.A.C. Resing (1996), Fitting discrete distributions on the first two moments. Probability in the Engineering and Informational Sciences, 9:623-632.


This article is issued from Wikipedia - version of the Monday, January 12, 2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.
