# Projection Pursuit Regression

### Description

Fit a projection pursuit regression model.

### Usage

```
ppr(x, ...)

## S3 method for class 'formula'
ppr(formula, data, weights, subset, na.action,
    contrasts = NULL, ..., model = FALSE)

## Default S3 method:
ppr(x, y, weights = rep(1, n),
    ww = rep(1, q), nterms, max.terms = nterms, optlevel = 2,
    sm.method = c("supsmu", "spline", "gcvspline"),
    bass = 0, span = 0, df = 5, gcvpen = 1, ...)
```
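As a minimal sketch of the two calling conventions above, the following fits the same model once through the formula method and once through the default (matrix) method. The `rock` data set and the rescaled columns `area1` and `peri1` follow the Examples section; the object names `rock1`, `x`, `y`, `fit_f` and `fit_d` are illustrative only.

```r
# Rescale two predictors, as in the Examples section
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)

# Formula interface: response and predictors are looked up in `data`
fit_f <- ppr(log(perm) ~ area1 + peri1 + shape,
             data = rock1, nterms = 2, max.terms = 3)

# Default interface: numeric matrix of predictors and a numeric response
x <- as.matrix(rock1[, c("area1", "peri1", "shape")])
y <- log(rock1$perm)
fit_d <- ppr(x, y, nterms = 2, max.terms = 3)

# Compare the residual sums of squares of the two fits
all.equal(fit_f$gof, fit_d$gof)
```

The formula interface is usually more convenient when factors or transformations are involved; the matrix interface avoids building a model frame.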

### Arguments

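| Argument | Description |
|---|---|
| `formula` | a formula specifying one or more numeric response variables and the explanatory variables. |
| `x` | numeric matrix of explanatory variables. Rows represent observations, and columns represent variables. Missing values are not accepted. |
| `y` | numeric matrix of response variables. Rows represent observations, and columns represent variables. Missing values are not accepted. |
| `nterms` | number of terms to include in the final model. |
| `data` | a data frame (or similar: see `model.frame`) from which variables specified in `formula` are preferentially to be taken. |
| `weights` | a vector of weights `w_i` for each *case*. |
| `ww` | a vector of weights for each *response*, so the fit criterion is the sum over cases `i` and responses `j` of `w_i ww_j (y_ij - fit_ij)^2` divided by the sum of `w_i`. |
| `subset` | an index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.) |
| `na.action` | a function to specify the action to be taken if `NA`s are found. The default action is given by `getOption("na.action")`. (NOTE: If given, this argument must be named.) |
| `contrasts` | the contrasts to be used when any factor explanatory variables are coded. |
| `max.terms` | maximum number of terms to choose from when building the model. |
| `optlevel` | integer from 0 to 3 which determines the thoroughness of an optimization routine in the SMART program. See the ‘Details’ section. |
| `sm.method` | the method used for smoothing the ridge functions. The default is to use Friedman's super smoother `supsmu`. The alternatives are to use the smoothing spline code underlying `smooth.spline`, either with a specified (equivalent) degrees of freedom for each ridge function, or with the smoothness chosen by GCV. Can be abbreviated. |
| `bass` | super smoother bass tone control used with automatic span selection (see `supsmu`); the range of values is 0 to 10, with larger values resulting in increased smoothing. |
| `span` | super smoother span control (see `supsmu`). The default, `0`, results in automatic span selection by local cross validation. `span` can also take a value in `(0, 1]`. |
| `df` | if `sm.method` is `"spline"`, specifies the smoothness of each ridge term via the requested equivalent degrees of freedom. |
| `gcvpen` | if `sm.method` is `"gcvspline"`, this is the penalty used in the GCV selection for each degree of freedom used. |
| `...` | arguments to be passed to or from other methods. |
| `model` | logical. If true, the model frame is returned. |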
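The smoothing-related arguments can be sketched as follows. This is an illustration only: it reuses the `rock` data from the Examples section, and the particular values of `bass`, `df` and `gcvpen` are arbitrary choices, not recommendations.

```r
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)
f <- log(perm) ~ area1 + peri1 + shape

# Super smoother (the default), with heavier smoothing via `bass`
fit_sup <- ppr(f, data = rock1, nterms = 2, max.terms = 3,
               sm.method = "supsmu", bass = 5)

# Smoothing spline with a requested equivalent degrees of freedom per term
fit_spl <- ppr(f, data = rock1, nterms = 2, max.terms = 3,
               sm.method = "spline", df = 3)

# Smoothing spline with smoothness chosen by GCV, penalised by `gcvpen`
fit_gcv <- ppr(f, data = rock1, nterms = 2, max.terms = 3,
               sm.method = "gcvspline", gcvpen = 2)

# Residual sums of squares of the three fits, side by side
gof_by_smoother <- c(supsmu = fit_sup$gof,
                     spline = fit_spl$gof,
                     gcvspline = fit_gcv$gof)
gof_by_smoother
```

Note that `bass` and `span` only affect the super smoother, `df` only affects `"spline"`, and `gcvpen` only affects `"gcvspline"`.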

### Details

The basic method is given by Friedman (1984), and is essentially the
same code used by S-PLUS's `ppreg`. This code is extremely
sensitive to the compiler used.

The algorithm first adds up to `max.terms` ridge terms one at a
time; it will use fewer if it is unable to find a term to add that makes
sufficient difference. It then removes the least
important term at each step until `nterms` terms
are left.
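A small sketch of how this forward-then-backward search shows up in a fitted object, assuming the `rock` data from the Examples section (the names `rock1` and `fit` are illustrative). The components `mu`, `ml` and `gofn` are described in the ‘Value’ section.

```r
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)

# Build up to 5 ridge terms, then prune back to 2
fit <- ppr(log(perm) ~ area1 + peri1 + shape,
           data = rock1, nterms = 2, max.terms = 5)

fit$mu          # number of terms kept (nterms)
fit$ml          # largest model considered (max.terms)
fit$gofn        # residual sum of squares against the number of terms
ncol(fit$alpha) # one projection direction per retained term
```

Inspecting `gofn` is one way to judge whether a larger `nterms` would be worthwhile: a sharp drop between successive model sizes suggests the extra term is doing real work.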

The levels of optimization (argument `optlevel`)
differ in how thoroughly the models are refitted during this process.
At level 0 the existing ridge terms are not refitted. At level 1
the projection directions are not refitted, but the ridge
functions and the regression coefficients are.
Levels 2 and 3 refit all the terms and are equivalent for one
response; level 3 is more careful to re-balance the contributions
from each regressor at each step and so is a little less likely to
converge to a saddle point of the sum of squares criterion.
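To see the effect of `optlevel` in practice, one can refit the same model at each level and compare the residual sums of squares. This is a sketch on the `rock` data from the Examples section; the helper name `gof_by_level` is illustrative, and the resulting values depend on platform and compiler.

```r
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)

# Fit the same model at each optimization level and record `gof`
gof_by_level <- sapply(0:3, function(lev) {
  ppr(log(perm) ~ area1 + peri1 + shape,
      data = rock1, nterms = 2, max.terms = 5, optlevel = lev)$gof
})
names(gof_by_level) <- paste("optlevel", 0:3)
gof_by_level
```

Because this example has a single response, levels 2 and 3 are described above as equivalent, so any difference would be expected mainly between levels 0, 1 and 2.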

### Value

A list with the following components, many of which are for use by the method functions.

| Component | Description |
|---|---|
| `call` | the matched call |
| `p` | the number of explanatory variables (after any coding) |
| `q` | the number of response variables |
| `mu` | the argument `nterms` |
| `ml` | the argument `max.terms` |
| `gof` | the overall residual (weighted) sum of squares for the selected model |
| `gofn` | the overall residual (weighted) sum of squares against the number of terms, up to `max.terms`. Will be invalid (and zero) for fewer than `nterms`. |
| `df` | the argument `df` |
| `edf` | if `sm.method` is `"spline"` or `"gcvspline"`, the equivalent number of degrees of freedom for each ridge term used. |
| `xnames` | the names of the explanatory variables |
| `ynames` | the names of the response variables |
| `alpha` | a matrix of the projection directions, with a column for each ridge term |
| `beta` | a matrix of the coefficients applied for each response to the ridge terms: the rows are the responses and the columns the ridge terms |
| `yb` | the weighted means of each response |
| `ys` | the overall scale factor used: internally the responses are divided by `ys` to have unit total weighted sum of squares. |
| `fitted.values` | the fitted values, as a matrix if `q > 1`. |
| `residuals` | the residuals, as a matrix if `q > 1`. |
| `smod` | internal work array, which includes the ridge functions evaluated at the training set points. |
| `model` | (only if `model = TRUE`) the model frame. |
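The following sketch shows how a few of these components relate to the data, again using the `rock` data from the Examples section. The names `X`, `z` and `res_check` are illustrative; `z` holds the projections of the predictors onto each direction in `alpha`, which are the inputs to the ridge functions.

```r
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)
fit <- ppr(log(perm) ~ area1 + peri1 + shape,
           data = rock1, nterms = 2, max.terms = 5)

# Basic dimensions recorded in the fit
c(p = fit$p, q = fit$q, mu = fit$mu, ml = fit$ml)

# Project the predictors onto the fitted directions (one column per term)
X <- as.matrix(rock1[, c("area1", "peri1", "shape")])
z <- X %*% fit$alpha
dim(z)

# Squared length of each projection direction
colSums(fit$alpha^2)

# Residuals are the response minus the fitted values
res_check <- all.equal(as.numeric(residuals(fit)),
                       as.numeric(log(rock1$perm) - fitted(fit)))
res_check
```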

### Source

Friedman (1984): converted to double precision and added interface to smoothing splines by B. D. Ripley, originally for the MASS package.

### References

Friedman, J. H. and Stuetzle, W. (1981)
Projection pursuit regression.
*Journal of the American Statistical Association*,
**76**, 817–823.

Friedman, J. H. (1984) SMART User's Guide. Laboratory for Computational Statistics, Stanford University Technical Report No. 1.

Venables, W. N. and Ripley, B. D. (2002)
*Modern Applied Statistics with S.* Springer.

### See Also

`plot.ppr`, `supsmu`, `smooth.spline`

### Examples

```
require(graphics)

# Note: your numerical values may differ
attach(rock)
area1 <- area/10000; peri1 <- peri/10000
rock.ppr <- ppr(log(perm) ~ area1 + peri1 + shape,
                data = rock, nterms = 2, max.terms = 5)
rock.ppr
# Call:
# ppr.formula(formula = log(perm) ~ area1 + peri1 + shape, data = rock,
#     nterms = 2, max.terms = 5)
#
# Goodness of fit:
#  2 terms  3 terms  4 terms  5 terms
# 8.737806 5.289517 4.745799 4.490378

summary(rock.ppr)
# .....  (same as above)
# .....
#
# Projection direction vectors:
#            term 1      term 2
# area1  0.34357179  0.37071027
# peri1 -0.93781471 -0.61923542
# shape  0.04961846  0.69218595
#
# Coefficients of ridge terms:
#    term 1    term 2
# 1.6079271 0.5460971

par(mfrow = c(3,2))   # maybe: , pty = "s")
plot(rock.ppr, main = "ppr(log(perm)~ ., nterms=2, max.terms=5)")
plot(update(rock.ppr, bass = 5), main = "update(..., bass = 5)")
plot(update(rock.ppr, sm.method = "gcv", gcvpen = 2),
     main = "update(..., sm.method=\"gcv\", gcvpen=2)")
cbind(perm = rock$perm, prediction = round(exp(predict(rock.ppr)), 1))
detach()
```
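The example above predicts on the training data only. As a hedged sketch, the same model can be fitted on a subset of rows via the (named) `subset` argument and then evaluated on the held-out rows with `predict(..., newdata = )`. The split used here (every fourth row held out) and the names `rock1`, `test_idx`, `train_idx`, `fit` and `pred` are arbitrary illustrations, not part of the original example.

```r
rock1 <- transform(rock, area1 = area / 10000, peri1 = peri / 10000)

# Hold out every fourth row; train on the rest
test_idx  <- seq(4, nrow(rock1), by = 4)
train_idx <- setdiff(seq_len(nrow(rock1)), test_idx)

# `subset` must be passed by name
fit <- ppr(log(perm) ~ area1 + peri1 + shape, data = rock1,
           subset = train_idx, nterms = 2, max.terms = 5)

# Predict log-permeability for the held-out rows and back-transform
pred <- exp(predict(fit, newdata = rock1[test_idx, ]))
cbind(perm = rock1$perm[test_idx], prediction = round(pred, 1))
```

This avoids `attach()`/`detach()` by keeping the rescaled predictors in the data frame itself.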