Data with 8 inputs and one output, used to illustrate the prediction problem and regression in the textbook of Hastie, Tibshirani and Friedman (2009).
A data frame with 97 observations on 10 variables: 8 inputs, 1 output (lpsa) and a logical indicator (train) marking the training set. All input variables have been standardized.
lcavol
log-cancer volume
lweight
log prostate weight
age
age in years
lbph
log benign prostatic hyperplasia
svi
seminal vesicle invasion
lcp
log of capsular penetration
gleason
Gleason score
pgg45
percent of Gleason scores 4 or 5
lpsa
Outcome. Log of PSA
train
TRUE if the observation belongs to the training set, FALSE if it belongs to the test set
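The format described above can be checked directly. The following is a minimal sketch, not part of bestglm: check_standardized is a hypothetical helper that reports the mean and standard deviation of each input column and the size of the training and test splits. It assumes the outcome column is named lpsa and the split column is named train, as listed above, and that all remaining columns are numeric.

```r
# Hypothetical helper (not part of bestglm): summarize the input columns
# of a data frame laid out like zprostate.
check_standardized <- function(df, outcome = "lpsa", split = "train") {
  # Inputs are every column except the outcome and the split indicator
  inputs <- df[, setdiff(names(df), c(outcome, split)), drop = FALSE]
  list(
    means = round(colMeans(inputs), 3),      # near 0 if standardized
    sds   = round(apply(inputs, 2, sd), 3),  # near 1 if standardized
    split = table(df[[split]])               # counts of FALSE / TRUE
  )
}

# Run on zprostate only if the bestglm package is installed
if (requireNamespace("bestglm", quietly = TRUE)) {
  data(zprostate, package = "bestglm")
  print(check_standardized(zprostate))
}
```

Means close to 0 and standard deviations close to 1 would be consistent with the statement that the inputs have been standardized.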
A study of 97 men with prostate cancer examined the correlation between PSA (prostate specific antigen) and a number of clinical measurements: lcavol, lweight, age, lbph, svi, lcp, gleason and pgg45.
Hastie, Tibshirani & Friedman. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd Ed. Springer.
# Prostate data. Table 3.3 HTF.
library(bestglm)
data(zprostate)
# Split into training and test sets using the logical column 10 (train)
trainQ <- zprostate[, 10]
train <- zprostate[trainQ, -10]
test <- zprostate[!trainQ, -10]
# Least squares fit on all inputs
ans <- lm(lpsa ~ ., data = train)
sig <- summary(ans)$sigma
yHat <- predict(ans, newdata = test)
yTest <- zprostate$lpsa[!trainQ]
TE <- mean((yTest - yHat)^2)
# Best subset regression, selected with the BICq criterion
ansSub <- bestglm(train, IC = "BICq")$BestModel
sigSub <- summary(ansSub)$sigma
yHatSub <- predict(ansSub, newdata = test)
TESub <- mean((yTest - yHatSub)^2)
# Compare test error and residual standard error of the two fits
m <- matrix(c(TE, sig, TESub, sigSub), ncol = 2)
dimnames(m) <- list(c("TestErr", "Sd"), c("LS", "Best"))
m