Freedman–Diaconis rule

In statistics, the Freedman–Diaconis rule, named after David A. Freedman and Persi Diaconis, can be used to select the size (width) of the bins in a histogram.^[1] The general equation for the rule is:

$$\text{Bin size} = 2\,\frac{\operatorname{IQR}(x)}{\sqrt[3]{n}}$$

where $\operatorname{IQR}(x)$ is the interquartile range of the data and $n$ is the number of observations in the sample $x$.
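As an illustration, the rule can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: the function names `quantile` and `freedman_diaconis_width` are hypothetical, and the quartiles are computed by linear interpolation between order statistics, which is only one of several conventions in common use (other conventions give slightly different IQR values).

```python
import math


def quantile(sorted_data, q):
    """Quantile by linear interpolation between order statistics.

    Assumption: this is one common convention; the rule itself does
    not prescribe how the quartiles are computed.
    """
    pos = q * (len(sorted_data) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_data[lo] + frac * (sorted_data[hi] - sorted_data[lo])


def freedman_diaconis_width(data):
    """Bin size = 2 * IQR(x) / n^(1/3)."""
    xs = sorted(data)
    n = len(xs)
    if n == 0:
        raise ValueError("data must not be empty")
    iqr = quantile(xs, 0.75) - quantile(xs, 0.25)
    return 2.0 * iqr / n ** (1.0 / 3.0)


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8]
    print(freedman_diaconis_width(sample))
```

For the sample 1, 2, ..., 8 this convention gives quartiles 2.75 and 6.25, so the IQR is 3.5 and, with $n = 8$, the bin size is about $2 \times 3.5 / 2 = 3.5$. Note that the sketch returns a width of zero whenever the IQR is zero, a case a caller would need to handle separately.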

Other approaches

Another approach is to use Sturges' rule: choose the bin width so that there are about $1+\log_2 n$ non-empty bins (Scott, 2009).^[2] This works well for $n$ under 200, but has been found to be inaccurate for large $n$.^[3] For a discussion and an alternative approach, see Birgé and Rozenholc.^[4]
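For comparison, Sturges' rule can be sketched as follows. The function name `sturges_bin_count` is hypothetical, and rounding up to the next integer is an assumption made here: the rule only states that there should be "about" $1+\log_2 n$ bins.

```python
import math


def sturges_bin_count(n):
    """Approximate number of bins under Sturges' rule: 1 + log2(n).

    Assumption: the result is rounded up to a whole number of bins.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(1 + math.log2(n))


if __name__ == "__main__":
    for n in (8, 200, 10_000):
        print(n, sturges_bin_count(n))
```

Because the bin count grows only logarithmically, increasing $n$ from 200 to 10,000 adds just a handful of bins under this sketch, which is consistent with the rule being reported as inaccurate for large $n$.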

References

[1] Freedman, David; Diaconis, Persi (December 1981). "On the histogram as a density estimator: L₂ theory" (PDF). Probability Theory and Related Fields (Heidelberg: Springer Berlin) 57 (4): 453–476. ISSN 0178-8051. Retrieved 2009-01-06.
[2] Scott, D.W. (2009). "Sturges' rule". WIREs Computational Statistics 1: 303–306. doi:10.1002/wics.35.
[3] Hyndman, R.J. (1995). "The problem with Sturges' rule for constructing histograms" (PDF).
[4] Birgé, L.; Rozenholc, Y. (2006). "How many bins should be put in a regular histogram". ESAIM: Probability and Statistics 10: 24–45. doi:10.1051/ps:2006001.

This article is issued from Wikipedia - version of the Friday, February 05, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.