Smooth maximum

In mathematics, a smooth maximum of an indexed family x_1, \ldots, x_n of numbers is a differentiable approximation to the maximum function

 \{x_1,\ldots,x_n\} \mapsto \max\{x_1,\ldots,x_n\},

and the concept of smooth minimum is similarly defined.

For large positive values of the parameter \alpha, the following formulation is one smooth, differentiable approximation of the maximum function; for negative values of \alpha that are large in absolute value, it approximates the minimum.


\mathcal{S}_\alpha (\{x_i\}_{i=1}^n) = \frac{\sum_{i=1}^n x_i e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}

\mathcal{S}_\alpha has the following properties:

  1. \mathcal{S}_\alpha\to \max as \alpha\to\infty
  2. \mathcal{S}_0 is the average of its inputs
  3. \mathcal{S}_\alpha\to \min as \alpha\to -\infty
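
These limits are easy to verify numerically. The following is a minimal sketch in Python; the function name smooth_max and the sample values are illustrative choices, not part of any standard library. The only subtlety is subtracting the largest exponent before exponentiating, which avoids overflow without changing the weights.

    import numpy as np

    def smooth_max(x, alpha):
        # Boltzmann-weighted smooth maximum S_alpha of the values in x.
        x = np.asarray(x, dtype=float)
        z = alpha * x
        z -= z.max()                      # shift exponents for numerical stability
        w = np.exp(z)
        return np.sum(w * x) / np.sum(w)  # weighted average of the inputs

    x = [1.0, 2.0, 3.0]
    print(smooth_max(x, 100.0))   # ~3.0: approaches max as alpha -> +infinity
    print(smooth_max(x, 0.0))     # 2.0: the arithmetic mean at alpha = 0
    print(smooth_max(x, -100.0))  # ~1.0: approaches min as alpha -> -infinity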

The gradient of \mathcal{S}_{\alpha} is closely related to softmax and is given by


\nabla_{x_i}\mathcal{S}_\alpha (\{x_i\}_{i=1}^n) = \frac{e^{\alpha x_i}}{\sum_{j=1}^n e^{\alpha x_j}} [1 + \alpha(x_i - \mathcal{S}_\alpha (\{x_i\}_{i=1}^n))].

This differentiability makes \mathcal{S}_\alpha useful in optimization techniques that use gradient descent.
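
As a sketch of how the gradient formula can be used in practice, the snippet below evaluates \mathcal{S}_\alpha together with its gradient and checks the analytic gradient against central finite differences; the function name smooth_max_and_grad and the test values are illustrative assumptions.

    import numpy as np

    def smooth_max_and_grad(x, alpha):
        # Return S_alpha(x) and its gradient via the softmax-weighted formula above.
        x = np.asarray(x, dtype=float)
        z = alpha * x
        z -= z.max()                      # stabilise the exponentials
        w = np.exp(z)
        w /= w.sum()                      # softmax weights e^{alpha x_i} / sum_j e^{alpha x_j}
        s = np.dot(w, x)                  # S_alpha
        grad = w * (1.0 + alpha * (x - s))
        return s, grad

    # Central-difference check of the analytic gradient (illustrative values).
    x = np.array([0.5, -1.2, 2.0])
    alpha = 3.0
    s, grad = smooth_max_and_grad(x, alpha)
    eps = 1e-6
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        numeric = (smooth_max_and_grad(xp, alpha)[0] - smooth_max_and_grad(xm, alpha)[0]) / (2 * eps)
        assert abs(numeric - grad[i]) < 1e-5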

Another smooth approximation to the maximum is the LogSumExp function:


g(x_1, \ldots,  x_n) = \log( \exp(x_1) + \ldots + \exp(x_n) )
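
Since \exp(\max_i x_i) \le \sum_i \exp(x_i) \le n\,\exp(\max_i x_i), this LogSumExp formulation satisfies \max_i x_i \le g(x_1,\ldots,x_n) \le \max_i x_i + \log(n), so it overestimates the maximum by at most \log(n). The sketch below evaluates it in the usual numerically stable way, factoring the largest input out of the sum; the function name logsumexp and the sample values are illustrative.

    import numpy as np

    def logsumexp(x):
        # log(exp(x_1) + ... + exp(x_n)), computed stably by factoring out the maximum.
        x = np.asarray(x, dtype=float)
        m = x.max()
        return m + np.log(np.sum(np.exp(x - m)))

    x = [1.0, 2.0, 3.0]
    print(logsumexp(x))                  # ~3.408, between max(x) and max(x) + log(3)
    print(max(x), max(x) + np.log(3))    # bounds: 3.0 and ~4.099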
