Quasi-Monte Carlo method

[Pseudorandom sequence]
[Low-discrepancy sequence (Sobol sequence)]
256 points from a pseudorandom number source, Halton sequence, and Sobol sequence (red=1,..,10, blue=11,..,100, green=101,..,256). Points from Sobol sequence are more evenly distributed.

In numerical analysis, quasi-Monte Carlo method is a method for numerical integration and solving some other problems using low-discrepancy sequences (also called quasi-random sequences or sub-random sequences). This is in contrast to the regular Monte Carlo method or Monte Carlo integration, which are based on sequences of pseudorandom numbers.

Monte Carlo and quasi-Monte Carlo methods are stated in a similar way. The problem is to approximate the integral of a function f as the average of the function evaluated at a set of points x1, ..., xN:

 \int_{[0,1]^s} f(u)\,{\rm d}u \approx \frac{1}{N}\,\sum_{i=1}^N f(x_i).

Since we are integrating over the s-dimensional unit cube, each xi is a vector of s elements. The difference between quasi-Monte Carlo and Monte Carlo is the way the xi are chosen. Quasi-Monte Carlo uses a low-discrepancy sequence such as the Halton sequence, the Sobol sequence, or the Faure sequence, whereas Monte Carlo uses a pseudorandom sequence. The advantage of using low-discrepancy sequences is a faster rate of convergence. Quasi-Monte Carlo has a rate of convergence close to O(1/N), whereas the rate for the Monte Carlo method is O(N−0.5).[1]

The Quasi-Monte Carlo method recently became popular in the area of mathematical finance or computational finance.[1] In these areas, high-dimensional numerical integrals, where the integral should be evaluated within a threshold ε, occur frequently. Hence, the Monte Carlo method and the quasi-Monte Carlo method are beneficial in these situations.

Approximation error bounds of quasi-Monte Carlo

The approximation error of the quasi-Monte Carlo method is bounded by a term proportional to the discrepancy of the set x1, ..., xN. Specifically, the Koksma-Hlawka inequality states that the error

 \epsilon = \left| \int_{[0,1]^s} f(u)\,{\rm d}u - \frac{1}{N}\,\sum_{i=1}^N f(x_i) \right|

is bounded by

 |\epsilon| \leq V(f) D_N ,

where V(f) is the Hardy-Krause variation of the function f (see Morokoff and Caflisch (1995) [2] for the detailed definitions). DN is the discrepancy of the set (x1,...,xN) and is defined as

 D_N = \sup_{Q \subset [0,1]^s} \left| \frac{\mbox{number of points in } Q}{N} - \mbox{volume}(Q)\right| ,

where Q is a rectangular solid in [0,1]s with sides parallel to the coordinate axes.[2] The inequality  |\epsilon| \leq V(f) D_N can be used to show that the error of the approximation by the quasi-Monte Carlo method is  O\left(\frac{(\log N)^s}{N}\right) , whereas the Monte Carlo method has a probabilistic error of  O\left(\frac{1}{\sqrt{N}}\right) . Though we can only state the upper bound of the approximation error, the convergence rate of quasi-Monte Carlo method in practice is usually much faster than its theoretical bound.[1] Hence, in general, the accuracy of the quasi-Monte Carlo method increases faster than that of the Monte Carlo method.

Monte Carlo and quasi-Monte Carlo for multidimensional integrations

For one-dimensional integration, quadrature methods such as the trapezoidal rule, Simpson's rule, or Newton–Cotes formulas are known to be efficient if the function is smooth. These approaches can be also used for multidimensional integrations by repeating the one-dimensional integrals over multiple dimensions. Cubature is one of the well known packages using quadrature methods that work great for low-dimensional integration. However, the number of function evaluations grow exponentially as s, the number of dimensions, increases. Hence, a method that can overcome this curse of dimensionality should be used for multidimensional integrations. The standard Monte Carlo method is frequently used when the quadrature methods are difficult or expensive to implement.[2] Monte Carlo and quasi-Monte Carlo methods are accurate and fast when the dimension is high, up to 300 or higher.[3]

Morokoff and Caflisch [2] studied the performance of Monte Carlo and quasi-Monte Carlo methods for integration. In the paper, Halton, Sobol, and Faure sequences for quasi-Monte Carlo are compared with the standard Monte Carlo method using pseudorandom sequences. They found that the Halton sequence performs best for dimensions up to around 6; the Sobol sequence performs best for higher dimensions; and the Faure sequence, while outperformed by the other two, still performs better than a pseudorandom sequence.

However, Morokoff and Caflisch [2] gave examples where the advantage of the quasi-Monte Carlo is less than expected theoretically. Still, in the examples studied by Morokoff and Caflisch, the quasi-Monte Carlo method did yield a more accurate result than the Monte Carlo method with the same number of points. Morokoff and Caflisch remark that the advantage of the quasi-Monte Carlo method is greater if the integrand is smooth, and the number of dimensions s of the integral is small.

Drawbacks of quasi-Monte Carlo

Lemieux mentioned the drawbacks of quasi-Monte Carlo:[4]

In order to overcome these difficulties, we can use a randomized quasi-Monte Carlo method.

Randomization of quasi-Monte Carlo

Since the low discrepancy sequence are not random, but deterministic, quasi-Monte Carlo method can be seen as a deterministic algorithm or derandomized algorithm. In this case, we only have the bound (e.g., ε ≤ V(f) DN) for error, and the error is hard to estimate. In order to recover our ability to analyze and estimate the variance, we can randomize the method (see randomization for the general idea). The resulting method is called the randomized quasi-Monte Carlo method and can be also viewed as a variance reduction technique for the standard Monte Carlo method.[5] Among several methods, the simplest transformation procedure is through random shifting. Let {x1,...,xN} be the point set from the low discrepancy sequence. We sample s-dimensional random vector U and mix it with {x1,...,xN}. In detail, for each xj, create

 y_{j} = x_{j} + U \pmod 1

and use the sequence (y_{j}) instead of (x_{j}). If we have R replications for Monte Carlo, sample s-dimensional random vector U for each replication. The drawback of randomization is the sacrifice of computation speed. Since we now use a pseudorandom number generator, the method is slower. Still, randomization is useful since the variance and the computation speed are slightly better than that of standard Monte Carlo, from the experimental results in Tuffin (2008) [6]

See also

References

  1. 1 2 3 Søren Asmussen and Peter W. Glynn, Stochastic Simulation: Algorithms and Analysis, Springer, 2007, 476 pages
  2. 1 2 3 4 5 William J. Morokoff and Russel E. Caflisch, Quasi-Monte Carlo integration, J. Comput. Phys. 122 (1995), no. 2, 218--230. (At CiteSeer: )
  3. Rudolf Schürer, A comparison between (quasi-)Monte Carlo and cubature rule based methods for solving high-dimensional integration problems, Mathematics and Computers in Simulation, Volume 62, Issues 3–6, 3 March 2003, 509–517
  4. Christiane Lemieux, Monte Carlo and Quasi-Monte Carlo Sampling, Springer, 2009, ISBN 978-1441926760
  5. Moshe Dror, Pierre L’Ecuyer and Ferenc Szidarovszky, Modeling Uncertainty: An Examination of Stochastic Theory, Methods, and Applications, Springer 2002, pp. 419-474
  6. Bruno Tuffin, Randomization of Quasi-Monte Carlo Methods for Error Estimation: Survey and Normal Approximation, Monte Carlo Methods and Applications mcma. Volume 10, Issue 3-4, Pages 617–628, ISSN (Online) 1569-3961, ISSN (Print) 0929-9629, DOI: 10.1515/mcma.2004.10.3-4.617, May 2008

External links

This article is issued from Wikipedia - version of the Sunday, July 19, 2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.