Generalized beta distribution

In probability and statistics, the generalized beta distribution[1] is a continuous probability distribution with five parameters, including more than thirty named distributions as limiting or special cases. It has been used in the modeling of income distribution, stock returns, as well as in regression analysis. The exponential generalized Beta (EGB) distribution follows directly from the GB and generalizes other common distributions.

Definition

A generalized beta random variable, Y, is defined by the following probability density function:

 GB(y;a,b,c,p,q) = \frac{|a|y^{ap-1}(1-(1-c)(y/b)^{a})^{q-1}}{b^{ap}B(p,q)(1+c(y/b)^{a})^{p+q}} \quad \quad \text{  for } 0<y^{a}< \frac{b^a}{1-c} ,

and zero otherwise. Here the parameters satisfy  0 \le c \le 1 and  b ,  p , and  q positive. The function B(p,q) is the beta function.

GB distribution tree

Properties

Moments

It can be shown that the hth moment can be expressed as follows:

 \operatorname{E}_{GB}(Y^{h})=\frac{b^{h}B(p+h/a,q)}{B(p,q)}{}_{2}F_{1} \begin{bmatrix}
 p + h/a,h/a;c \\
 p + q +h/a;
\end{bmatrix},

where {}_{2}F_{1} denotes the hypergeometric series (which converges for all h if c<1, or for all h/a<q if c=1 ).

Related distributions

The generalized beta encompasses a number of distributions in its family as special cases. Listed below are its three direct descendants, or sub-families.

Generalized beta of first kind (GB1)

The generalized beta of the first kind is defined by the following pdf:

 GB1(y;a,b,p,q) = \frac{|a|y^{ap-1}(1-(y/b)^{a})^{q-1}}{b^{ap}B(p,q)}

for  0< y^{a}<b^{a} where  b ,  p , and  q are positive. It is easily verified that

 GB1(y;a,b,p,q) = GB(y;a,b,c=0,p,q).

The moments of the GB1 are given by

 \operatorname{E}_{GB1}(Y^{h}) = \frac{b^{h}B(p+h/a,q)}{B(p,q)}.

The GB1 includes the beta of the first kind (B1), generalized gamma(GG), and Pareto as special cases:

 B1(y;b,p,q) = GB1(y;a=1,b,p,q) ,
 GG(y;a,\beta,p) = \lim_{q \to \infty}
GB1(y;a,b=q^{1/a}\beta,p,q) ,
 PARETO(y;b,p) = GB1(y;a=-1,b,p,q=1) .

Generalized beta of the second kind (GB2)

The GB2 (also known as the Generalized Beta Prime) is defined by the following pdf:

 GB2(y;a,b,p,q) = \frac{|a|y^{ap-1}}{b^{ap}B(p,q)(1+(y/b)^a)^{p+q}}

for  0< y < \infty and zero otherwise. One can verify that

 GB2(y;a,b,p,q) = GB(y;a,b,c=1,p,q).

The moments of the GB2 are given by

 \operatorname{E}_{GB2}(Y^h) = \frac{b^h B(p+h/a,q-h/a)}{B(p,q)}.

The GB2 nests common distributions such as the generalized gamma (GG), Burr type 3, Burr type 12, Dagum, lognormal, Weibull, gamma, Lomax, F statistic, Fisk or Rayleigh, chi-square, half-normal, half-Student's t, exponential, and the log-logistic.[2]

Beta

The beta distribution (B) is defined by:[1]

 B(y;b,c,p,q) = \frac{y^{p-1}(1-(1-c)(y/b))^{q-1}}{b^{p}B(p,q)(1+c(y/b))^{p+q}}

for  0<y<b/(1-c) and zero otherwise. Its relation to the GB is seen below:

 B(y;b,c,p,q) = GB(y;a=1,b,c,p,q).

The beta family includes the betas of the first and second kind[3] (B1 and B2, where the B2 is also referred to as the Beta prime), which correspond to c = 0 and c = 1, respectively.

Generalized Gamma

The generalized gamma distribution (GG) is a limiting case of the GB2. The PDF is defined by:[4]

 GG(y;a,\beta,p) = \lim_{q \rightarrow \infty} GB2(y,a,b=q^{1/a} \beta,p,q) = \frac{|a|y^{ap-1}e^{-(y/\beta)^{a}}}{\beta^{a} \Gamma (p)}

with the hth moments given by

 \operatorname{E}(Y_{GG}^h) = \frac{\beta^h \Gamma (p + h/a)}{\Gamma (p)}.

A figure showing the relationship between the GB and its special and limiting cases is included above (see McDonald and Xu (1995) ).

Exponential generalized beta distribution

Letting  Y \sim GB(y;a,b,c,p,q) , the random variable  Z = \ln(Y) , with re-parametrization, is distributed as an exponential generalized beta (EGB), with the following pdf:

 EGB(z;\delta,\sigma,c,p,q) = \frac{e^{p(z-\delta)/\sigma}(1-(1-c)e^{(z-\delta)/\sigma})^{q-1}}{|\sigma|B(p,q)(1+ce^{(z-\delta)/\sigma})^{p+q}}

for  -\infty < \frac{z-\delta}{\sigma}<\ln(\frac{1}{1-c}) , and zero otherwise. The EGB includes generalizations of the Gompertz, Gumbell, extreme value type I, logistic, Burr-2, exponential, and normal distributions.

Included is a figure showing the relationship between the EGB and its special and limiting cases.[5]

The EGB family of distributions

Moment generating function

Using similar notation as above, the moment-generating function of the EGB can be expressed as follows:

 M_{EGB}(Z)=\frac{e^{\delta t}B(p+t\sigma,q)}{B(p,q)}{}_{2}F_{1} \begin{bmatrix}
 p + t\sigma,t\sigma;c \\
 p + q +t\sigma;
\end{bmatrix}.

Applications

The flexibility provided by the GB family is used in modeling the distribution of:

Applications involving members of the EGB family include:[1][2]

Distribution of Income

The GB2 and several of its special and limiting cases have been widely used as models for the distribution of income. For some early examples see Thurow (1970),[6] Dagum (1977),[7] Singh and Maddala (1976),[8] and McDonald (1984).[2] Maximum likelihood estimations using individual, grouped, or top-coded data are easily performed with these distributions.

Measures of inequality, such as the Gini index (G), Pietra index (P), and Theil index (T) can be expressed in terms of the distributional parameters, as given by McDonald and Ransom (2008):[9]

\begin{align} G=\left({\frac{1}{2\mu}}\right) \operatorname{E}(|Y-X|) = \left(P{\frac{1}{2\mu}}\right) \int_{0}^{\infty}\int_{0}^{\infty} |x-y|f(x)f(y)\,dx dy \\
= 1 - \frac{\int_{0}^{\infty} (1-F(y))^2\,dy}{\int_{0}^{\infty} (1-F(y))\,dy} \\
P = \left( \frac{1}{2\mu}\right) \operatorname{E} (|Y-\mu|) = \left(\frac{1}{2\mu}\right)\int_0^{\infty} |y-\mu|f(y)\, dy \\
T = \operatorname{E} (\ln (Y/\mu)^{Y/\mu}) = \int_0^ \infty (y/\mu) \ln (y/\mu) f(y)\, dy
\end{align}

Hazard Functions

The hazard function, h(s), where f(s) is a pdf and F(s) the corresponding cdf, is defined by

 h(s) = \frac{f(s)}{1-F(s)}

Hazard functions are useful in many applications, such as modeling unemployment duration, the failure time of products or life expectancy. Taking a specific example, if s denotes the length of life, then h(s) is the rate of death at age s, given that an individual has lived up to age s. The shape of the hazard function for mortality data might appear as follows: The hazard function h(s) exhibits decreasing mortality in the first few months of life, then a period of relatively constant mortality and finally an increasing probability of death at older ages.

Special cases of the generalized beta distribution offer more flexibility in modeling the shape of the hazard function, which can call for "∪" or "∩" shapes or strictly increasing (denoted by I) or decreasing (denoted by D) lines. The generalized gamma is "∪"-shaped for a>1 and p<1/a, "∩"-shaped for a<1 and p>1/a, I-shaped for a>1 and p>1/a and D-shaped for a<1 and p>1/a.[10] This is summarized in the figure below.[11][12]

Possible hazard function shapes using the generalized gamma

References

  1. 1 2 3 McDonald, James B. & Xu, Yexiao J. (1995) "A generalization of the beta distribution with applications," Journal of Econometrics, 66(1–2), 133–152 doi:10.1016/0304-4076(94)01612-4
  2. 1 2 3 McDonald, J.B. (1984) "Some generalized functions for the size distributions of income", Econometrica 52, 647–663.
  3. Stuart, A. and Ord, J.K. (1987): Kendall's Advanced Theory of Statistics, New York: Oxford University Press.
  4. Stacy, E.W. (1962). "A Generalization of the Gamma Distribution." Annals of Mathematical Statistics 33(3): 1187-1192. JSTOR 2237889
  5. McDonald, James B. & Kerman, Sean C. (2013) "Skewness-Kurtosis Bounds for EGB1, EGB2, and Special Cases," Forthcoming
  6. Thurow, L.C. (1970) "Analyzing the American Income Distribution," Papers and Proceedings, American Economics Association, 60, 261-269
  7. Dagum, C. (1977) "A New Model for Personal Income Distribution: Specification and Estimation," Economie Applique'e, 30, 413-437
  8. Singh, S.K. and Maddala, G.S (1976) "A Function for the Size Distribution of Incomes," Econometrica, 44, 963-970
  9. McDonald, J.B. and Ransom, M. (2008) "The Generalized Beta Distribution as a Model for the Distribution of Income: Estimation of Related Measures of Inequality", Modeling the Distributions and Lorenz Curves, "Economic Studies in Inequality: Social Exclusion and Well-Being", Springer: New York editor Jacques Silber, 5, 147-166
  10. Glaser, Ronald E. (1980) "Bathtub and Related Failure Rate Characterizations," Journal of the American Statistical Association, 75(371), 667-672 doi:10.1080/01621459.1980.10477530
  11. McDonald, James B. (1987) "A general methodology for determining distributional forms with applications in reliability," Journal of Statistical Planning and Inference, 16, 365-376 doi:10.1016/0378-3758(87)90089-9
  12. McDonald, J.B. and Richards, D.O. (1987) "Hazard Functions and Generalized Beta Distributions", IEEE Transactions on Reliability, 36, 463-466

Bibliography

This article is issued from Wikipedia - version of the Thursday, May 05, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.