Folded normal distribution

Probability density function, $\mu = 1$, $\sigma = 1$
Cumulative distribution function, $\mu = 1$, $\sigma = 1$
Parameters	$\mu \in \mathbb{R}$ (location) $\sigma^2 > 0$ (scale)
Support	$x \in [0,\infty)$
PDF	$\frac{1}{\sigma\sqrt{2\pi}} \, e^{ -\frac{(x-\mu)^2}{2\sigma^2} } + \frac{1}{\sigma\sqrt{2\pi}} \, e^{ -\frac{(x+\mu)^2}{2\sigma^2} }$
CDF	$\frac{1}{2}\left[ \mbox{erf}\left(\frac{x+\mu}{\sigma\sqrt{2}}\right) + \mbox{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$
Mean	$\mu_Y = \sigma \sqrt{\tfrac{2}{\pi}} \, e^{(-\mu^2/2\sigma^2)} + \mu \left(1 - 2\,\Phi(\tfrac{-\mu}{\sigma}) \right)$
Variance	$\sigma_Y^2 = \mu^2 + \sigma^2 - \mu_Y^2$

The folded normal distribution is a probability distribution related to the normal distribution. Given a normally distributed random variable X with mean μ and variance σ², the random variable Y = |X| has a folded normal distribution. Such a case may be encountered if only the magnitude of some variable is recorded, but not its sign. The distribution is called folded because the probability mass to the left of x = 0 is "folded" over by taking the absolute value. In the physics of heat conduction, the folded normal distribution is a fundamental solution of the heat equation on the upper half-plane (i.e. a heat kernel).

The probability density function (PDF) is given by

$f_Y(x;\mu,\sigma^2)= \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{ -\frac{(x-\mu)^2}{2\sigma^2} } + \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{ -\frac{(x+\mu)^2}{2\sigma^2} }$

for x≥0, and 0 everywhere else. An alternative formulation is given by

$f\left(x \right)=\sqrt{\frac{2}{\pi\sigma^2}}e^{-\frac{\left(x^2+\mu^2 \right)}{2\sigma^2}}\cosh{\left(\frac{\mu x}{\sigma^2}\right)}$ ,

where cosh is the hyperbolic cosine function. It follows that the cumulative distribution function (CDF) is given by

$F_Y(x; \mu, \sigma^2) = \frac{1}{2}\left[ \operatorname{erf}\left(\frac{x+\mu}{\sqrt{2\sigma^2}}\right) + \operatorname{erf}\left(\frac{x-\mu}{\sqrt{2\sigma^2}}\right)\right]$

for x≥0, where erf() is the error function. This expression reduces to the CDF of the half-normal distribution when μ = 0.
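The density and CDF above can be sketched in R as follows. This is a minimal illustration, not part of the original article: the function names `dfoldnorm`, `dfoldnorm_cosh` and `pfoldnorm` are hypothetical, and the CDF is written with `pnorm` using the identity erf(z/√2) = 2Φ(z) − 1.

```r
# Density of Y = |X| with X ~ N(mu, sigma^2); zero for x < 0.
# (Function names are illustrative, not from an existing package.)
dfoldnorm <- function(x, mu = 0, sigma = 1) {
  ifelse(x >= 0, dnorm(x, mu, sigma) + dnorm(x, -mu, sigma), 0)
}

# Alternative cosh formulation of the same density (valid for x >= 0).
dfoldnorm_cosh <- function(x, mu = 0, sigma = 1) {
  sqrt(2 / (pi * sigma^2)) * exp(-(x^2 + mu^2) / (2 * sigma^2)) *
    cosh(mu * x / sigma^2)
}

# CDF: since erf(z / sqrt(2)) = 2 * pnorm(z) - 1, the erf formula equals
# pnorm((x + mu) / sigma) + pnorm((x - mu) / sigma) - 1 for x >= 0.
pfoldnorm <- function(x, mu = 0, sigma = 1) {
  ifelse(x >= 0, pnorm((x + mu) / sigma) + pnorm((x - mu) / sigma) - 1, 0)
}

# Numerical sanity checks for mu = 1, sigma = 1.
total_mass <- integrate(dfoldnorm, 0, Inf, mu = 1, sigma = 1)$value
cdf_at_2   <- integrate(dfoldnorm, 0, 2, mu = 1, sigma = 1)$value
```

Here `total_mass` should be close to 1 and `cdf_at_2` close to `pfoldnorm(2, 1, 1)`, up to the accuracy of `integrate`.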

The mean of the folded distribution is then

$\mu_Y = \sigma \sqrt{\frac{2}{\pi}} \, \exp\left(\frac{-\mu^2}{2\sigma^2}\right) - \mu \, \operatorname{erf}\left(\frac{-\mu}{\sqrt{2\sigma^2}}\right)$

or, equivalently,

$\mu_Y = \sqrt{\frac{2}{\pi}}\sigma e^{-\frac{\mu^2}{2\sigma^2}}+\mu\left[1-2\Phi\left(-\frac{\mu}{\sigma}\right) \right]$

where $\Phi$ is the normal cumulative distribution function:

$\Phi(x) = \frac12\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$ .

The variance then is expressed easily in terms of the mean:

$\sigma_Y^2 = \mu^2 + \sigma^2 - \mu_Y^2$ .

Both the mean (μ) and variance (σ²) of X in the original normal distribution can be interpreted as the location and scale parameters of Y in the folded distribution.
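As a rough check of the mean and variance formulas, the sketch below evaluates them in R and compares the mean with a numerical integral of x f(x). The names `fold_mean`, `fold_var` and `dens` are hypothetical helpers introduced only for this illustration.

```r
# Mean of the folded normal, using the Phi (pnorm) form of the formula.
fold_mean <- function(mu, sigma) {
  sigma * sqrt(2 / pi) * exp(-mu^2 / (2 * sigma^2)) +
    mu * (1 - 2 * pnorm(-mu / sigma))
}

# Variance expressed through the mean: mu^2 + sigma^2 - mu_Y^2.
fold_var <- function(mu, sigma) {
  mu^2 + sigma^2 - fold_mean(mu, sigma)^2
}

# Folded normal density on x >= 0, used only for the numerical comparison.
dens <- function(x, mu, sigma) dnorm(x, mu, sigma) + dnorm(x, -mu, sigma)

num_mean <- integrate(function(x) x * dens(x, 1, 1), 0, Inf)$value
num_m2   <- integrate(function(x) x^2 * dens(x, 1, 1), 0, Inf)$value
```

For μ = 0 the formulas reduce to the half-normal values σ√(2/π) and σ²(1 − 2/π), which gives a second quick check.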

Mode of the distribution

The mode of the distribution is the value of $x$ for which the density is maximised. In order to find this value, we take the first derivative of the density with respect to $x$ and set it equal to zero. Unfortunately, there is no closed form. We can, however, write the derivative in a better way and end up with a non-linear equation

$\frac{df(x)}{dx}=0 \Rightarrow -\frac{\left(x-\mu\right)}{\sigma^2}e^{-\frac{1}{2}\frac{\left(x-\mu\right)^2}{\sigma^2}}- \frac{\left(x+\mu\right)}{\sigma^2}e^{-\frac{1}{2}\frac{\left(x+\mu\right)^2}{\sigma^2}}=0$

$x\left[e^{-\frac{1}{2}\frac{\left(x-\mu\right)^2}{\sigma^2}}+e^{-\frac{1}{2}\frac{\left(x+\mu\right)^2}{\sigma^2}}\right]- \mu \left[e^{-\frac{1}{2}\frac{\left(x-\mu\right)^2}{\sigma^2}}-e^{-\frac{1}{2}\frac{\left(x+\mu\right)^2}{\sigma^2}}\right]=0$

$x\left(1+e^{-\frac{2\mu x}{\sigma^2}}\right)-\mu\left(1-e^{-\frac{2\mu x}{\sigma^2}}\right)=0$

$\left(\mu+x\right)e^{-\frac{2\mu x}{\sigma^2}}=\mu-x$

$x=-\frac{\sigma^2}{2\mu}\log{\frac{\mu-x}{\mu+x}}$ .

Tsagris et al. (2014) found by numerical investigation that when $\mu<\sigma$ , the maximum is attained at $x=0$ , and that when $\mu$ exceeds $3\sigma$ , the maximum approaches $\mu$ . This is to be expected, since in that case the folded normal converges to the normal distribution. When the parameters are estimated numerically, trouble with negative variances can be avoided by optimising over the logarithm of the variance, so that the variance itself is recovered by exponentiation. Alternatively, a constraint can be added: if the optimiser proposes a negative variance, the log-likelihood is set to NA or to a very small value.
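Because the mode has no closed form, it can be located numerically. The sketch below is one possible approach, not the method of the cited authors: the name `fold_mode` is hypothetical, and the code relies on the observation that the stationarity condition above can be rearranged as x = μ tanh(μx/σ²), which has no positive solution when |μ| ≤ σ.

```r
# Numerical mode of the folded normal (illustrative helper).
fold_mode <- function(mu, sigma) {
  mu <- abs(mu)               # the density depends on mu only through |mu|
  if (mu <= sigma) return(0)  # only stationary point is x = 0, the maximum
  dens <- function(x) dnorm(x, mu, sigma) + dnorm(x, -mu, sigma)
  # For mu > sigma the density rises from x = 0 to a single peak below mu,
  # so a one-dimensional search on [0, mu + sigma] is sufficient.
  optimize(dens, interval = c(0, mu + sigma), maximum = TRUE, tol = 1e-10)$maximum
}

mode_far  <- fold_mode(3, 1)    # mu > 3 * sigma is not needed; mu = 3 * sigma here
mode_near <- fold_mode(0.5, 1)  # mu < sigma, so the mode is at zero
```

With μ = 3 and σ = 1 the result should lie very close to μ, in line with the numerical findings quoted above.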

Characteristic function and other related functions

The characteristic function is given by

$\varphi_x\left(t\right)=e^{-\frac{\sigma^2 t^2}{2}+i\mu t}\left[1-\Phi\left(-\frac{\mu}{\sigma}-i\sigma t \right) \right]+ e^{-\frac{\sigma^2 t^2}{2}-i\mu t}\left[1-\Phi\left(\frac{\mu}{\sigma}-i\sigma t \right) \right]$ .

The moment generating function is given by

$M_x\left(t\right)=\varphi_x\left(-it\right)=e^{\frac{\sigma^2 t^2}{2}+\mu t}\left[1-\Phi\left(-\frac{\mu}{\sigma}-\sigma t \right) \right]+ e^{\frac{\sigma^2 t^2}{2}-\mu t}\left[1-\Phi\left(\frac{\mu}{\sigma}-\sigma t \right) \right]$ .

The cumulant generating function is given by

$K_x\left(t\right)=\log{M_x\left(t\right)}= \left(\frac{\sigma^2t^2}{2}+\mu t\right) + \log{\left\lbrace 1-\Phi\left(-\frac{\mu}{\sigma}-\sigma t \right) + e^{-2\mu t}\left[1-\Phi\left(\frac{\mu}{\sigma}-\sigma t \right) \right] \right\rbrace}$ .

The Laplace transform is given by

$E\left(e^{-tx}\right)=e^{\frac{\sigma^2t^2}{2}-\mu t}\left[1-\Phi\left(-\frac{\mu}{\sigma}+\sigma t \right) \right]+ e^{\frac{\sigma^2 t^2}{2}+\mu t}\left[1-\Phi\left(\frac{\mu}{\sigma}+\sigma t \right) \right]$ .

The Fourier transform is given by

$\hat{f}\left(t\right)=\varphi_x\left(-2\pi t\right)= e^{-\frac{4\pi^2\sigma^2 t^2}{2}- i2\pi \mu t}\left[1-\Phi\left(-\frac{\mu}{\sigma}+i2\pi \sigma t \right) \right]+ e^{-\frac{4\pi^2 \sigma^2 t^2}{2}+i2\pi\mu t}\left[1-\Phi\left(\frac{\mu}{\sigma}+i2\pi \sigma t \right) \right]$ .
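The moment generating function and the Laplace transform take real arguments, so they can be compared directly with numerical integrals of e^{tx} f(x) and e^{-tx} f(x). The following R sketch does this for one parameter setting; `fold_mgf`, `fold_laplace` and `transform_numeric` are hypothetical names introduced for the illustration.

```r
# Closed-form moment generating function of the folded normal.
fold_mgf <- function(t, mu, sigma) {
  exp(sigma^2 * t^2 / 2 + mu * t) * (1 - pnorm(-mu / sigma - sigma * t)) +
    exp(sigma^2 * t^2 / 2 - mu * t) * (1 - pnorm(mu / sigma - sigma * t))
}

# The Laplace transform E(exp(-t x)) is the MGF evaluated at -t.
fold_laplace <- function(t, mu, sigma) fold_mgf(-t, mu, sigma)

# Numerical counterpart: integrate exp(s * x) * f(x) over x >= 0.
transform_numeric <- function(s, mu, sigma) {
  integrand <- function(x) {
    exp(s * x) * (dnorm(x, mu, sigma) + dnorm(x, -mu, sigma))
  }
  integrate(integrand, 0, Inf)$value
}

mgf_closed  <- fold_mgf(0.5, mu = 1, sigma = 1)
mgf_numeric <- transform_numeric(0.5, mu = 1, sigma = 1)
lap_closed  <- fold_laplace(0.7, mu = 1, sigma = 1)
lap_numeric <- transform_numeric(-0.7, mu = 1, sigma = 1)
```

The closed-form and numerical values should agree up to the tolerance of `integrate`, and `fold_mgf(0, mu, sigma)` should equal 1.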

Parameter estimation

Several ways of estimating the parameters of the folded normal are presented here. All of them are essentially maximum likelihood estimation, but in some cases a numerical maximisation is performed, whereas in others the root of an equation is sought. The log-likelihood of the folded normal for a sample $x_i$ of size $n$ can be written as

$l = -\frac{n}{2}\log{2\pi\sigma^2}+\sum_{i=1}^n\log{\left[e^{-\frac{\left(x_i-\mu\right)^2}{2\sigma^2}}+ e^{-\frac{\left(x_i+\mu\right)^2}{2\sigma^2}} \right] }$

$l = -\frac{n}{2}\log{2\pi\sigma^2}+\sum_{i=1}^n\log{\left[e^{-\frac{\left(x_i-\mu\right)^2}{2\sigma^2}} \left(1+e^{-\frac{\left(x_i+\mu\right)^2}{2\sigma^2}}e^{\frac{\left(x_i-\mu\right)^2}{2\sigma^2}}\right)\right]}$

$l = -\frac{n}{2}\log{2\pi\sigma^2}-\sum_{i=1}^n\frac{\left(x_i-\mu\right)^2}{2\sigma^2}+\sum_{i=1}^n\log{\left(1+e^{-\frac{2\mu x_i}{\sigma^2}} \right)}$

In R (programming language), the functions optim or nlm can perform the maximisation, which is fast because only two parameters ( $\mu$ and $\sigma^2$ ) are involved. Both positive and negative values of $\mu$ are acceptable, since $\mu$ ranges over the whole real line; its sign is unimportant because the likelihood is symmetric in $\mu$ . The following function is written in R:

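folded <- function(y) {
  ## y is a vector of non-negative data
  n <- length(y)  ## sample size
  ## negative log-likelihood (cosh form of the density); para = c(mu, sigma^2)
  sam <- function(para) {
    me <- para[1]
    se <- para[2]  ## se is the variance sigma^2, not the standard deviation
    if (se <= 0) {
      f <- 100000  ## large penalty keeps the optimiser away from invalid variances
    } else {
      f <- - n/2 * log(2/pi) + n/2 * log(se) + n * me^2 / (2 * se) +
        sum(y^2) / (2 * se) - sum( log( cosh( (me * y)/se ) ) )
    }
    f
  }
  ## Nelder-Mead from moment-based starting values, then a restart from the first solution
  mod <- optim( c( mean(y), var(y) ), sam, control = list(maxit = 2000) )
  mod <- optim( mod$par, sam, control = list(maxit = 20000) )
  result <- c(-mod$value, mod$par)
  names(result) <- c("log-likelihood", "mu", "sigma squared")
  result
}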

The partial derivatives of the log-likelihood are written as

$\frac{\partial l}{\partial \mu} = \frac{\sum_{i=1}^n\left(x_i-\mu \right)}{\sigma^2}- \frac{2}{\sigma^2}\sum_{i=1}^n\frac{x_ie^{\frac{-2\mu x_i}{\sigma^2}}}{1+e^{\frac{-2\mu x_i}{\sigma^2}}}$

$\frac{\partial l}{\partial \mu} = \frac{\sum_{i=1}^n\left(x_i-\mu \right)}{\sigma^2}-\frac{2}{\sigma^2}\sum_{i=1}^n\frac{x_i}{1+e^{\frac{2\mu x_i}{\sigma^2}}} \ \ \text{and}$

$\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2}+\frac{\sum_{i=1}^n\left(x_i-\mu \right)^2}{2\sigma^4}+ \frac{2\mu}{\sigma^4}\sum_{i=1}^n\frac{x_ie^{-\frac{2\mu x_i}{\sigma^2}}}{1+e^{-\frac{2\mu x_i}{\sigma^2}}}$

$\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2}+\frac{\sum_{i=1}^n\left(x_i-\mu \right)^2}{2\sigma^4}+ \frac{2\mu}{\sigma^4}\sum_{i=1}^n\frac{x_i}{1+e^{\frac{2\mu x_i}{\sigma^2}}}$ .
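The partial derivatives can be checked against central finite differences of the log-likelihood. The R sketch below is only an illustration: `loglik` and `score` are hypothetical names, the data vector is made up, and the evaluation point (μ = 1, σ² = 1.5) is arbitrary.

```r
# Log-likelihood in the last form given above (s2 denotes sigma^2).
loglik <- function(mu, s2, x) {
  -length(x) / 2 * log(2 * pi * s2) - sum((x - mu)^2) / (2 * s2) +
    sum(log1p(exp(-2 * mu * x / s2)))
}

# Analytic partial derivatives with respect to mu and sigma^2.
score <- function(mu, s2, x) {
  w <- x / (1 + exp(2 * mu * x / s2))
  c(dmu = sum(x - mu) / s2 - 2 / s2 * sum(w),
    ds2 = -length(x) / (2 * s2) + sum((x - mu)^2) / (2 * s2^2) +
      2 * mu / s2^2 * sum(w))
}

x <- c(0.3, 1.2, 2.5, 0.8, 1.9)  # made-up sample, for illustration only
h <- 1e-6                        # finite-difference step
fd <- c(dmu = (loglik(1 + h, 1.5, x) - loglik(1 - h, 1.5, x)) / (2 * h),
        ds2 = (loglik(1, 1.5 + h, x) - loglik(1, 1.5 - h, x)) / (2 * h))
analytic <- score(1, 1.5, x)
```

The two vectors `fd` and `analytic` should agree to several decimal places.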

By equating the first partial derivative of the log-likelihood to zero, we obtain a nice relationship

$\sum_{i=1}^n\frac{x_i}{1+e^{\frac{2\mu x_i}{\sigma^2}}}=\frac{\sum_{i=1}^n\left(x_i-\mu \right)}{2}$ .

Note that the above equation has three solutions for $\mu$ : one at zero and two of equal magnitude and opposite sign. Substituting the above equation into the partial derivative of the log-likelihood with respect to $\sigma^2$ and equating it to zero gives the following expression for the variance

$\sigma^2=\frac{\sum_{i=1}^n\left(x_i-\mu\right)^2}{n}+\frac{2\mu\sum_{i=1}^n\left(x_i-\mu\right)}{n}=\frac{\sum_{i=1}^n\left(x_i^2-\mu^2\right)}{n}=\frac{\sum_{i=1}^nx_i^2}{n}-\mu^2$ ,

which has the same form as in the normal distribution. A main difference here is that the estimates of $\mu$ and $\sigma^2$ are not statistically independent. The above relationships can be used to obtain maximum likelihood estimates in an efficient recursive way. We start with an initial value for $\sigma^2$ and find the positive root ( $\mu$ ) of the equation obtained from $\partial l/\partial \mu = 0$ . Then we compute an updated value of $\sigma^2$ from the variance expression. The procedure is repeated until the change in the log-likelihood value is negligible. An easier and more efficient alternative is a root search. The equation obtained from $\partial l/\partial \mu = 0$ can be written more compactly as

$2\sum_{i=1}^n\frac{x_i}{1+e^{\frac{2\mu x_i}{\sigma^2}}}- \sum_{i=1}^n\frac{x_i\left(1+e^{\frac{2\mu x_i}{\sigma^2}}\right)}{1+e^{\frac{2\mu x_i}{\sigma^2}}}+n\mu = 0$

$\sum_{i=1}^n\frac{x_i\left(1-e^{\frac{2\mu x_i}{\sigma^2}}\right)}{1+e^{\frac{2\mu x_i}{\sigma^2}}}+n\mu = 0$ .

It becomes clear that the optimization of the log-likelihood with respect to the two parameters has turned into a root search of a function. This is of course identical to the previous root search. Tsagris et al. (2014) observed that there are three roots to this equation for $\mu$ , i.e. three possible values of $\mu$ that satisfy it: $-\mu$ and $+\mu$ , which are the maximum likelihood estimates, and 0, which corresponds to the minimum of the log-likelihood.
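The root search can be sketched in R with `uniroot`, substituting the variance expression $\sigma^2 = \sum x_i^2/n - \mu^2$ into the equation so that only $\mu$ remains. This is a minimal sketch under stated assumptions, not the implementation of the cited authors: `fold_mle_root` and its argument `lower_frac` are hypothetical, and the code assumes the function changes sign on the search interval (otherwise `uniroot` stops with an error). It also uses the identity (1 − e^a)/(1 + e^a) = −tanh(a/2).

```r
# Maximum likelihood estimates via a one-dimensional root search (illustrative).
fold_mle_root <- function(x, lower_frac = 0.01) {
  n  <- length(x)
  m2 <- mean(x^2)
  g <- function(mu) {
    s2 <- m2 - mu^2                     # variance implied by the current mu
    n * mu - sum(x * tanh(mu * x / s2)) # sum x (1 - e^a) / (1 + e^a) + n * mu
  }
  # Search strictly inside (0, sqrt(m2)) so that s2 stays positive and the
  # trivial root mu = 0 is excluded.
  lower <- lower_frac * sqrt(m2)
  upper <- sqrt(m2) * (1 - 1e-8)
  root  <- uniroot(g, lower = lower, upper = upper, tol = 1e-10)$root
  c(mu = root, sigma2 = m2 - root^2)
}

# Deterministic pseudo-sample: absolute values of N(2, 1) quantiles.
y   <- abs(qnorm(ppoints(2000), mean = 2, sd = 1))
est <- fold_mle_root(y)
```

Only the positive root is returned; by the symmetry noted above, the negative root has the same magnitude.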

Differential equations

The PDF of the folded normal distribution can also be defined as the solution of the following second-order differential equation with initial conditions

\begin{cases} \sigma^4 f''(x) + 2\sigma^2 x f'(x) + \left(-\mu ^2+\sigma^2+x^2\right) f(x) = 0 \\ f(0) = \sqrt{\frac{2}{\pi\sigma^2}} \, e^{-\frac{\mu^2}{2\sigma^2}} \\ f'(0) = 0 \end{cases}
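A quick numerical check of the differential equation is sketched below in R, using central finite differences for the derivatives. The names `f` and `ode_residual`, the step size and the parameter values are assumptions made for this illustration.

```r
# Folded normal density for x >= 0 (the expression itself is smooth and even in x).
f <- function(x, mu, sigma) dnorm(x, mu, sigma) + dnorm(x, -mu, sigma)

# Left-hand side of the differential equation, with derivatives approximated
# by central finite differences of step h.
ode_residual <- function(x, mu, sigma, h = 1e-4) {
  f0 <- f(x, mu, sigma)
  d1 <- (f(x + h, mu, sigma) - f(x - h, mu, sigma)) / (2 * h)
  d2 <- (f(x + h, mu, sigma) - 2 * f0 + f(x - h, mu, sigma)) / h^2
  sigma^4 * d2 + 2 * sigma^2 * x * d1 + (-mu^2 + sigma^2 + x^2) * f0
}

res <- ode_residual(c(0.5, 1, 2, 3), mu = 1, sigma = 1.5)

# Initial condition f(0) from the closed-form expression.
f0_formula <- sqrt(2 / (pi * 1.5^2)) * exp(-1^2 / (2 * 1.5^2))
```

The residuals in `res` should be close to zero, limited only by the finite-difference error, and `f(0, 1, 1.5)` should match `f0_formula`.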

Related distributions

When $\mu = 0$ , the distribution of $Y$ is a half-normal distribution.
The random variable $(Y/\sigma)^2$ has a noncentral chi-squared distribution with 1 degree of freedom and noncentrality equal to $(\mu/\sigma)^2$ .
The folded normal distribution can also be seen as the limit of the folded non-standardized t distribution as the degrees of freedom go to infinity. The folded non-standardized t distribution is the distribution of the absolute value of the non-standardized t distribution with $v$ degrees of freedom:

$g\left(x\right)=\frac{\Gamma\left(\frac{v+1}{2}\right)}{\Gamma\left(\frac{v}{2}\right)\sqrt{v\pi\sigma^2}}\left\lbrace \left[1+\frac{1}{v}\frac{\left(x-\mu\right)^2}{\sigma^2}\right]^{-\frac{v+1}{2}}+\left[1+\frac{1}{v}\frac{\left(x+\mu\right)^2}{\sigma^2}\right]^{-\frac{v+1}{2}} \right\rbrace$ .

There is a bivariate version developed by Psarakis and Panaretos (2001) as well as a multivariate version developed by Chakraborty and Moutushi (2013).
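The relation to the noncentral chi-squared distribution listed above can be checked numerically: P((Y/σ)² ≤ q) equals the folded normal CDF evaluated at σ√q. The R sketch below compares this with `pchisq`; the helper name `pfoldnorm` and the parameter values are chosen only for illustration.

```r
# Folded normal CDF for x >= 0, written with pnorm.
pfoldnorm <- function(x, mu, sigma) {
  pnorm((x + mu) / sigma) + pnorm((x - mu) / sigma) - 1
}

mu    <- 1.3
sigma <- 0.8
q     <- c(0.5, 1, 4, 9)

# P((Y / sigma)^2 <= q) computed two ways.
via_folded <- pfoldnorm(sigma * sqrt(q), mu, sigma)
via_chisq  <- pchisq(q, df = 1, ncp = (mu / sigma)^2)
```

The vectors `via_folded` and `via_chisq` should agree up to numerical precision.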

External links

Virtual Laboratories: The Folded Normal Distribution

References

Tsagris M, Beneki C, Hassani H (2014). "On the folded normal distribution". Mathematics 2 (1): 12–28.
Leone FC, Nottingham RB, Nelson LS (1961). "The Folded Normal Distribution". Technometrics 3 (4): 543–550. doi:10.2307/1266560. JSTOR 1266560.
Johnson NL (1962). "The folded normal distribution: accuracy of the estimation by maximum likelihood". Technometrics 4 (2): 249–256. doi:10.2307/1266622. JSTOR 1266622.
Nelson LS (1980). "The Folded Normal Distribution". J Qual Technol 12 (4): 236–238.
Elandt RC (1961). "The folded normal distribution: two methods of estimating parameters from moments". Technometrics 3 (4): 551–562. doi:10.2307/1266561. JSTOR 1266561.
Lin PC (2005). "Application of the generalized folded-normal distribution to the process capability measures". Int J Adv Manuf Technol 26 (7–8): 825–830. doi:10.1007/s00170-003-2043-x.
Psarakis S, Panaretos J (1990). "The folded t distribution". Communications in Statistics-Theory and Methods 19 (7): 2717–2734.
Psarakis S, Panaretos J (2001). "On some bivariate extensions of the folded normal and the folded-t distributions". Journal of Applied Statistical Science 10 (2): 119–136.
Chakraborty AK, Moutushi C (2013). "On multivariate folded normal distribution". Sankhya B 75 (1): 1–15.

This article is issued from Wikipedia - version of the Monday, March 14, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.