Isotonic regression

[Figure: An example of isotonic regression]

In numerical analysis, isotonic regression (IR) involves finding a weighted least-squares fit x\in \Bbb{R}^n to a vector a\in \Bbb{R}^n with weight vector w\in \Bbb{R}^n, subject to a set of non-contradictory constraints of the kind x_i \ge x_j.

Such constraints define a partial or total order and can be represented as a directed graph G=(N,E), where N is the set of variables involved, and E is the set of pairs (i, j), one for each constraint x_i \ge x_j. Thus, the IR problem corresponds to the following quadratic program (QP):

\min \sum_{i=1}^n w_i (x_i - a_i)^2 \quad \text{subject to } x_i \ge x_j \text{ for all } (i,j)\in E.

In the case where G=(N,E) is a total order, a simple iterative algorithm for solving this QP is the pool adjacent violators algorithm (PAVA). Best and Chakravarti (1990) studied the problem as an active set identification problem and proposed a primal algorithm that runs in O(n) time, the same complexity as PAVA, which can be viewed as a dual algorithm.[1]
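As a concrete illustration, the following is a minimal Python sketch of PAVA for the totally ordered case; the function name pava and its interface are illustrative, not taken from a particular library.

    # Minimal pool adjacent violators algorithm (PAVA) for the totally
    # ordered case: weighted least-squares isotonic fit with
    # x_1 <= x_2 <= ... <= x_n.
    def pava(a, w=None):
        if w is None:
            w = [1.0] * len(a)
        # Each block stores [weighted mean, total weight, points pooled].
        blocks = []
        for ai, wi in zip(a, w):
            blocks.append([ai, wi, 1])
            # Pool adjacent blocks while they violate the ordering.
            while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
                m2, w2, c2 = blocks.pop()
                m1, w1, c1 = blocks.pop()
                wt = w1 + w2
                blocks.append([(w1 * m1 + w2 * m2) / wt, wt, c1 + c2])
        # Expand the blocks back into a full-length solution vector.
        x = []
        for m, _, c in blocks:
            x.extend([m] * c)
        return x

    # Example: a = (1, 3, 2, 4) with unit weights; the violating pair
    # (3, 2) is pooled into its weighted mean 2.5.
    print(pava([1, 3, 2, 4]))  # [1, 2.5, 2.5, 4]

Each pooling step replaces a decreasing adjacent pair of blocks with their weighted average, which is why the resulting fit is piecewise constant.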

IR has applications in statistical inference, for example, fitting an isotonic curve to mean experimental results when an increasing order is expected. A benefit of isotonic regression is that it does not assume any functional form for the target function, such as the linearity assumed by linear regression.

Another application is nonmetric multidimensional scaling,[2] where a low-dimensional embedding of data points is sought such that the order of distances between points in the embedding matches the order of dissimilarities between the points. Isotonic regression is used iteratively to fit ideal distances that preserve the relative dissimilarity order.
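As a brief illustration (not part of the original article), scikit-learn's MDS estimator performs Kruskal-style nonmetric scaling of this kind when metric=False; the dissimilarity matrix below is purely synthetic.

    import numpy as np
    from sklearn.manifold import MDS

    rng = np.random.default_rng(0)
    points = rng.random((10, 5))  # 10 synthetic points in 5 dimensions
    # Pairwise Euclidean distances serve as the dissimilarity matrix.
    diss = np.linalg.norm(points[:, None] - points[None, :], axis=-1)

    # metric=False requests nonmetric MDS, which alternates between moving
    # the embedding and an isotonic-regression step on the fitted distances.
    mds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
              random_state=0)
    embedding = mds.fit_transform(diss)  # 10 x 2 low-dimensional embedding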

Isotonic regression is also sometimes referred to as monotonic regression. Strictly speaking, isotonic is used when the direction of the trend is increasing, while monotonic can refer to a trend that is either increasing or decreasing.

Isotonic regression under the L_p metric for p>0 is defined as follows:

\min \sum_{i=1}^n w_i |x_i - a_i|^p \quad \text{subject to } x_i \ge x_j \text{ for all } (i,j)\in E.
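For small problems, this more general program can be handed to a generic solver. The sketch below (with the illustrative helper name isotonic_lp) uses scipy.optimize.minimize; for p = 2 it recovers the same fit as PAVA. Specialized algorithms are far more efficient for large n.

    import numpy as np
    from scipy.optimize import minimize

    def isotonic_lp(a, w, p, edges):
        """Minimize sum_i w_i |x_i - a_i|^p s.t. x_i >= x_j, (i, j) in edges."""
        a, w = np.asarray(a, float), np.asarray(w, float)
        objective = lambda x: np.sum(w * np.abs(x - a) ** p)
        # One inequality constraint x_i - x_j >= 0 per edge of G = (N, E).
        constraints = [{"type": "ineq", "fun": lambda x, i=i, j=j: x[i] - x[j]}
                       for i, j in edges]
        return minimize(objective, a.copy(), constraints=constraints,
                        method="SLSQP").x

    # Total order x_0 <= x_1 <= x_2 <= x_3, encoded as edges (i, j)
    # meaning x_i >= x_j.
    edges = [(1, 0), (2, 1), (3, 2)]
    print(isotonic_lp([1, 3, 2, 4], [1, 1, 1, 1], 2, edges))  # ~[1, 2.5, 2.5, 4]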

Simply ordered case

To illustrate the above, let x_1 \leq x_2 \leq \ldots \leq x_n denote the ordered observation points, f(x_1), f(x_2), \ldots, f(x_n) the observed values, and w_i \geq 0 the weights.

The isotonic estimator, g^*, minimizes the weighted least squares-like condition

\min_g \sum_{i=1}^n w_i (g(x_i) - f(x_i))^2 \quad \text{subject to } g(x_1) \leq g(x_2) \leq \ldots \leq g(x_n),

where g is the unknown monotone function being estimated, and f is the known function giving the observed values.

Software for computing isotone (monotonic) regression has been developed for the R statistical environment.[3]
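Comparable functionality exists outside R as well; for instance, scikit-learn in Python provides an IsotonicRegression estimator for the totally ordered case, shown here as a brief usage sketch.

    from sklearn.isotonic import IsotonicRegression

    x = [1, 2, 3, 4]                      # ordered predictor values
    y = [1, 3, 2, 4]                      # observed responses
    iso = IsotonicRegression()            # increasing fit by default
    print(iso.fit_transform(x, y))        # [1.  2.5 2.5 4. ]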

References

  1. Best, M. J.; Chakravarti, N. (1990). "Active set algorithms for isotonic regression; a unifying framework". Mathematical Programming 47: 425–439. doi:10.1007/BF01580873.
  2. Kruskal, J. B. (1964). "Nonmetric Multidimensional Scaling: A numerical method". Psychometrika 29 (2): 115–129. doi:10.1007/BF02289694.
  3. Leeuw, Jan de; Hornik, Kurt; Mair, Patrick (2009). "Isotone Optimization in R: Pool-Adjacent-Violators Algorithm (PAVA) and Active Set Methods". Journal of Statistical Software 32 (5): 1–24. doi:10.18637/jss.v032.i05. ISSN 1548-7660.
