Smooth maximum

In mathematics, a smooth maximum of an indexed family x_1, \ldots, x_n of numbers is a differentiable approximation to the maximum function

 \{x_1,\ldots,x_n\} \mapsto \max\{x_1,\ldots,x_n\},

and the concept of smooth minimum is similarly defined.

For large positive values of the parameter \alpha, the following formulation is one smooth, differentiable approximation of the maximum function; for negative values of \alpha that are large in absolute value, it approximates the minimum.


\mathcal{S}_\alpha (\{x_i\}_{i=1}^n) = \frac{\sum_{i=1}^n x_i e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}

\mathcal{S}_\alpha has the following properties:

  1. \mathcal{S}_\alpha\to \max as \alpha\to\infty
  2. \mathcal{S}_0 is the average of its inputs
  3. \mathcal{S}_\alpha\to \min as \alpha\to -\infty
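
These limits are easy to verify numerically. The following is a minimal sketch in Python; the function name smooth_max and the sample values are illustrative choices, not part of any standard library. The only subtlety is subtracting the largest exponent before exponentiating, which avoids overflow without changing the weights.

    import numpy as np

    def smooth_max(x, alpha):
        # Boltzmann-weighted smooth maximum S_alpha of the values in x.
        x = np.asarray(x, dtype=float)
        z = alpha * x
        z -= z.max()                      # shift exponents for numerical stability
        w = np.exp(z)
        return np.sum(w * x) / np.sum(w)  # weighted average of the inputs

    x = [1.0, 2.0, 3.0]
    print(smooth_max(x, 100.0))   # ~3.0: approaches max as alpha -> +infinity
    print(smooth_max(x, 0.0))     # 2.0: the arithmetic mean at alpha = 0
    print(smooth_max(x, -100.0))  # ~1.0: approaches min as alpha -> -infinity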

The gradient of \mathcal{S}_{\alpha} is closely related to softmax and is given by


\nabla_{x_i}\mathcal{S}_\alpha (\{x_i\}_{i=1}^n) = \frac{e^{\alpha x_i}}{\sum_{j=1}^n e^{\alpha x_j}} [1 + \alpha(x_i - \mathcal{S}_\alpha (\{x_i\}_{i=1}^n))].

This differentiability makes \mathcal{S}_\alpha useful in optimization techniques that use gradient descent.
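
As a sketch of how the gradient formula can be used in practice, the snippet below evaluates \mathcal{S}_\alpha together with its gradient and checks the analytic gradient against central finite differences; the function name smooth_max_and_grad and the test values are illustrative assumptions.

    import numpy as np

    def smooth_max_and_grad(x, alpha):
        # Return S_alpha(x) and its gradient via the softmax-weighted formula above.
        x = np.asarray(x, dtype=float)
        z = alpha * x
        z -= z.max()                      # stabilise the exponentials
        w = np.exp(z)
        w /= w.sum()                      # softmax weights e^{alpha x_i} / sum_j e^{alpha x_j}
        s = np.dot(w, x)                  # S_alpha
        grad = w * (1.0 + alpha * (x - s))
        return s, grad

    # Central-difference check of the analytic gradient (illustrative values).
    x = np.array([0.5, -1.2, 2.0])
    alpha = 3.0
    s, grad = smooth_max_and_grad(x, alpha)
    eps = 1e-6
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        numeric = (smooth_max_and_grad(xp, alpha)[0] - smooth_max_and_grad(xm, alpha)[0]) / (2 * eps)
        assert abs(numeric - grad[i]) < 1e-5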

Another smooth approximation to the maximum is the LogSumExp function:


g(x_1, \ldots,  x_n) = \log( \exp(x_1) + \ldots + \exp(x_n) )
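
Since \exp(\max_i x_i) \le \sum_i \exp(x_i) \le n\,\exp(\max_i x_i), this LogSumExp formulation satisfies \max_i x_i \le g(x_1,\ldots,x_n) \le \max_i x_i + \log(n), so it overestimates the maximum by at most \log(n). The sketch below evaluates it in the usual numerically stable way, factoring the largest input out of the sum; the function name logsumexp and the sample values are illustrative.

    import numpy as np

    def logsumexp(x):
        # log(exp(x_1) + ... + exp(x_n)), computed stably by factoring out the maximum.
        x = np.asarray(x, dtype=float)
        m = x.max()
        return m + np.log(np.sum(np.exp(x - m)))

    x = [1.0, 2.0, 3.0]
    print(logsumexp(x))                  # ~3.408, between max(x) and max(x) + log(3)
    print(max(x), max(x) + np.log(3))    # bounds: 3.0 and ~4.099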
