Yule–Simon distribution
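[Plot: probability mass function]

[Plot: cumulative distribution function]

| Parameters | $\rho > 0$ shape (real) |
|---|---|
| Support | $k \in \{1, 2, \dots\}$ |
| pmf | $\rho\,\mathrm{B}(k, \rho+1)$ |
| CDF | $1 - k\,\mathrm{B}(k, \rho+1)$ |
| Mean | $\frac{\rho}{\rho-1}$ for $\rho > 1$ |
| Mode | $1$ |
| Variance | $\frac{\rho^2}{(\rho-1)^2(\rho-2)}$ for $\rho > 2$ |
| Skewness | $\frac{(\rho+1)^2\sqrt{\rho-2}}{(\rho-3)\,\rho}$ for $\rho > 3$ |
| Ex. kurtosis | $\rho+3+\frac{11\rho^3-49\rho-22}{(\rho-4)(\rho-3)\,\rho}$ for $\rho > 4$ |
| MGF | $\frac{\rho}{\rho+1}\,{}_2F_1(1,1;\rho+2;e^{t})\,e^{t}$ |
| CF | $\frac{\rho}{\rho+1}\,{}_2F_1(1,1;\rho+2;e^{it})\,e^{it}$ |
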
In probability and statistics, the Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert A. Simon. Simon originally called it the Yule distribution.[1]
The probability mass function (pmf) of the Yule–Simon(ρ) distribution is

$$f(k;\rho) = \rho\,\mathrm{B}(k, \rho+1),$$

for integer $k \geq 1$ and real $\rho > 0$, where $\mathrm{B}$ is the beta function. Equivalently, the pmf can be written in terms of the falling factorial as

$$f(k;\rho) = \frac{\rho\,\Gamma(\rho+1)}{(k+\rho)^{\underline{\rho+1}}},$$

where $\Gamma$ is the gamma function. Thus, if $\rho$ is an integer,

$$f(k;\rho) = \frac{\rho\,\rho!\,(k-1)!}{(k+\rho)!}.$$
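The definition can be checked numerically. The following is a minimal sketch, assuming the beta-function form $f(k;\rho) = \rho\,\mathrm{B}(k,\rho+1)$ and the factorial form for integer $\rho$; the function names are illustrative and do not come from any library.

```python
import math


def yule_simon_pmf(k: int, rho: float) -> float:
    """Return f(k; rho) = rho * B(k, rho + 1) for integer k >= 1 and real rho > 0."""
    if k < 1 or rho <= 0:
        raise ValueError("require integer k >= 1 and real rho > 0")
    # log B(k, rho + 1) = lgamma(k) + lgamma(rho + 1) - lgamma(k + rho + 1);
    # working in log space avoids overflow of the gamma function for large k.
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)


def yule_simon_pmf_integer_rho(k: int, rho: int) -> float:
    """Closed form rho * rho! * (k - 1)! / (k + rho)! valid for integer rho."""
    return rho * math.factorial(rho) * math.factorial(k - 1) / math.factorial(k + rho)
```

For integer $\rho$ the two functions should agree up to floating-point rounding, and partial sums of the first function approach 1 as more terms are included.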
The parameter $\rho$ can be estimated using a fixed-point algorithm.[2]
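One way such an estimator can look is sketched below. It is not necessarily the algorithm of the cited paper: it is a fixed-point iteration derived here from the maximum-likelihood equation, under the assumption that the observations $k_1, \dots, k_N$ are independent draws from $f(k;\rho) = \rho\,\mathrm{B}(k,\rho+1)$. Setting the derivative of the log-likelihood to zero gives $\rho = N \big/ \sum_i \sum_{j=1}^{k_i} \frac{1}{\rho + j}$, which is iterated until it stabilises. The function name and default arguments are illustrative.

```python
def fit_rho_fixed_point(data, rho0=1.0, tol=1e-10, max_iter=10_000):
    """Estimate rho by iterating rho <- N / sum_i sum_{j=1..k_i} 1 / (rho + j).

    `data` is a sequence of integers >= 1. This is a sketch of a likelihood-based
    fixed-point iteration, not a reproduction of any published algorithm.
    """
    n = len(data)
    rho = rho0
    for _ in range(max_iter):
        # Sum of psi(k + rho + 1) - psi(rho + 1) over the sample, written as
        # finite harmonic-type sums so that no digamma function is needed.
        s = sum(1.0 / (rho + j) for k in data for j in range(1, k + 1))
        new_rho = n / s
        if abs(new_rho - rho) < tol:
            return new_rho
        rho = new_rho
    raise RuntimeError("fixed-point iteration did not converge")
```

If every observation equals 1, the update reduces to $\rho \leftarrow \rho + 1$ and the iteration never settles, so the sketch raises an error in that case.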
The probability mass function $f$ has the property that for sufficiently large $k$ we have

$$f(k;\rho) \approx \frac{\rho\,\Gamma(\rho+1)}{k^{\rho+1}} \propto \frac{1}{k^{\rho+1}}.$$

This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: $f(k;\rho)$ can be used to model, for example, the relative frequency of the $k$th most frequent word in a large collection of text, which according to Zipf's law is inversely proportional to a (typically small) power of $k$.
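The power-law tail can be illustrated numerically. The sketch below assumes the pmf $f(k;\rho) = \rho\,\mathrm{B}(k,\rho+1)$ and the tail approximation $\rho\,\Gamma(\rho+1)/k^{\rho+1}$, and computes their ratio for increasing $k$; the value $\rho = 1.5$ and the helper names are arbitrary choices for illustration.

```python
import math


def yule_simon_pmf(k, rho):
    """f(k; rho) = rho * B(k, rho + 1), evaluated through log-gamma."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)


def tail_approximation(k, rho):
    """Power-law approximation rho * Gamma(rho + 1) / k**(rho + 1) for large k."""
    return rho * math.gamma(rho + 1.0) / k ** (rho + 1.0)


rho = 1.5  # illustrative shape parameter
ratios = [
    yule_simon_pmf(k, rho) / tail_approximation(k, rho)
    for k in (10, 100, 1000, 10000)
]
for k, r in zip((10, 100, 1000, 10000), ratios):
    print(k, r)
```

The ratios should move toward 1 as $k$ grows, which is what the approximation claims.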
Occurrence
The Yule–Simon distribution arose originally as the limiting distribution of a particular stochastic process studied by Yule as a model for the distribution of biological taxa and subtaxa.[3] Simon dubbed this process the "Yule process" but it is more commonly known today as a preferential attachment process. The preferential attachment process is an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number the urn already contains.
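The urn process can be simulated directly. The sketch below follows Simon's formulation of the model, which is an assumption not spelled out in the paragraph above: at each step a new urn containing one ball is created with probability `alpha`, and otherwise a ball is added to an existing urn chosen with probability proportional to its current size. Under that assumption the urn sizes are expected to approach a Yule–Simon distribution with $\rho = 1/(1-\alpha)$. All names are illustrative.

```python
import random
from collections import Counter


def simulate_urn_process(steps, alpha, seed=0):
    """Simulate a preferential attachment urn process and return the urn sizes."""
    rng = random.Random(seed)
    urn_sizes = [1]    # start with a single urn holding one ball
    ball_owner = [0]   # ball_owner[b] is the index of the urn that holds ball b
    for _ in range(steps):
        if rng.random() < alpha:
            # Create a new urn that receives the new ball.
            urn_sizes.append(1)
            ball_owner.append(len(urn_sizes) - 1)
        else:
            # Picking a uniformly random existing ball selects its urn with
            # probability proportional to the number of balls in that urn.
            urn = ball_owner[rng.randrange(len(ball_owner))]
            urn_sizes[urn] += 1
            ball_owner.append(urn)
    return urn_sizes


def empirical_pmf(urn_sizes):
    """Fraction of urns holding exactly k balls, for each observed k."""
    counts = Counter(urn_sizes)
    total = len(urn_sizes)
    return {k: c / total for k, c in sorted(counts.items())}
```

With `alpha = 0.5`, for example, the assumed limit has $\rho = 2$, so roughly two thirds of the urns should hold a single ball in a long run.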
The distribution also arises as a compound distribution, in which the parameter of a geometric distribution is treated as a function of a random variable having an exponential distribution. Specifically, assume that $W$ follows an exponential distribution with scale $1/\rho$ or rate $\rho$:

$$W \sim \operatorname{Exponential}(\rho),$$

with density

$$h(w;\rho) = \rho \exp(-\rho w).$$

Then a Yule–Simon distributed variable $K$ has the following geometric distribution conditional on $W$:

$$K \sim \operatorname{Geometric}(\exp(-W)).$$

The pmf of a geometric distribution is

$$g(k;p) = p\,(1-p)^{k-1}$$

for $k \in \{1, 2, \dots\}$. The Yule–Simon pmf is then the following exponential-geometric compound distribution:

$$f(k;\rho) = \int_0^\infty g(k;\exp(-w))\,h(w;\rho)\,dw.$$
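The compound representation gives a simple way to draw samples: draw $W$ from an exponential distribution with rate $\rho$, then draw $K$ from a geometric distribution on $\{1, 2, \dots\}$ with success probability $\exp(-W)$. The following is a minimal sketch using only the Python standard library; the function name and the underflow guard are illustrative choices.

```python
import math
import random


def sample_yule_simon(rho, rng):
    """Draw one Yule-Simon(rho) variate via the exponential-geometric mixture."""
    w = rng.expovariate(rho)           # W ~ Exponential(rate = rho)
    p = max(math.exp(-w), 1e-300)      # success probability; guard against underflow to 0
    if p >= 1.0:
        return 1                       # W == 0: the first trial succeeds with certainty
    u = 1.0 - rng.random()             # uniform on (0, 1], so log(u) is finite
    # Inverse-transform sampling for a geometric variable on {1, 2, ...}:
    # P(K > k) = (1 - p)**k, hence K = floor(log(U) / log(1 - p)) + 1.
    return int(math.floor(math.log(u) / math.log1p(-p))) + 1
```

As a sanity check, the fraction of samples equal to 1 should be close to $f(1;\rho) = \rho/(\rho+1)$ for a large sample.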
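The following recurrence relation holds:

$$k\,f(k;\rho) = (\rho + k + 1)\,f(k+1;\rho), \qquad f(1;\rho) = \rho\,\mathrm{B}(1, \rho+1) = \frac{\rho}{\rho+1}.$$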
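A recurrence of this kind is convenient for tabulating the pmf without evaluating gamma functions. The sketch below assumes the form $k\,f(k;\rho) = (\rho+k+1)\,f(k+1;\rho)$ with starting value $f(1;\rho) = \rho/(\rho+1)$, which follows from the ratio $\mathrm{B}(k+1,\rho+1)/\mathrm{B}(k,\rho+1) = k/(k+\rho+1)$; the function name is illustrative.

```python
import math


def pmf_by_recurrence(k_max, rho):
    """Return [f(1), ..., f(k_max)] using f(k + 1) = f(k) * k / (rho + k + 1)."""
    values = [rho / (rho + 1.0)]  # f(1; rho) = rho * B(1, rho + 1)
    for k in range(1, k_max):
        values.append(values[-1] * k / (rho + k + 1.0))
    return values


def pmf_by_beta(k, rho):
    """Direct evaluation rho * B(k, rho + 1), used here only for comparison."""
    log_beta = math.lgamma(k) + math.lgamma(rho + 1.0) - math.lgamma(k + rho + 1.0)
    return rho * math.exp(log_beta)
```

Both routes should give the same values up to rounding error.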
Generalizations
The two-parameter generalization of the original Yule distribution replaces the beta function with an incomplete beta function. The probability mass function of the generalized Yule–Simon(ρ, α) distribution is defined as

$$f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k, \rho+1),$$

with $0 \leq \alpha < 1$. For $\alpha = 0$ the ordinary Yule–Simon(ρ) distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.
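The generalized pmf can be evaluated numerically. The sketch below assumes the form $f(k;\rho,\alpha) = \frac{\rho}{1-\alpha^{\rho}}\,\mathrm{B}_{1-\alpha}(k,\rho+1)$ with $0 \leq \alpha < 1$, where $\mathrm{B}_x(a,b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt$ is the incomplete beta function. To stay within the standard library, the integral is approximated with composite Simpson's rule; this is a convenience choice for illustration, and a dedicated special-function routine would normally be preferable. All names are illustrative.

```python
def incomplete_beta(x, a, b, n=2000):
    """Approximate B_x(a, b) on [0, x] by composite Simpson's rule (n must be even)."""
    if x <= 0.0:
        return 0.0
    h = x / n

    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    total = integrand(0.0) + integrand(x)
    for i in range(1, n):
        # Simpson weights alternate 4, 2, 4, ... over the interior nodes.
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def generalized_yule_simon_pmf(k, rho, alpha):
    """f(k; rho, alpha) = rho / (1 - alpha**rho) * B_{1 - alpha}(k, rho + 1)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("require 0 <= alpha < 1")
    return rho / (1.0 - alpha ** rho) * incomplete_beta(1.0 - alpha, k, rho + 1.0)
```

With `alpha = 0` the function should reproduce the ordinary pmf, and for `alpha > 0` the probabilities should still sum to 1 while decaying roughly like $(1-\alpha)^k$ for large $k$.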
Bibliography
- Colin Rose and Murray D. Smith, Mathematical Statistics with Mathematica. New York: Springer, 2002, ISBN 0-387-95234-9. (See page 107, where it is called the "Yule distribution".)
References
1. Simon, H. A. (1955). "On a class of skew distribution functions". Biometrika 42 (3–4): 425–440. doi:10.1093/biomet/42.3-4.425.
2. Garcia Garcia, Juan Manuel (2011). "A fixed-point algorithm to estimate the Yule-Simon distribution parameter". Applied Mathematics and Computation 217 (21): 8560–8566. doi:10.1016/j.amc.2011.03.092.
3. Yule, G. U. (1925). "A Mathematical Theory of Evolution, based on the Conclusions of Dr. J. C. Willis, F.R.S". Philosophical Transactions of the Royal Society B 213 (402–410): 21–87. doi:10.1098/rstb.1925.0002.




![\left\{\begin{array}{l}
k P(k)=(\alpha +k+1) P(k+1), \\[10pt]
P(1)=\alpha  B(\alpha +1,1)
\end{array}\right\}](../I/m/6fd3d3705cace51ba66e7755500db5e1.png)
