Data with 8 inputs and one output, used to illustrate the prediction problem and regression in the textbook of Hastie, Tibshirani and Friedman (2009).
A data frame with 97 observations on 10 variables: 8 inputs, 1 output and a training-set indicator. All input variables have been standardized.
lcavol: log cancer volume
lweight: log prostate weight
age: age in years
lbph: log benign prostatic hyperplasia
svi: seminal vesicle invasion
lcp: log of capsular penetration
gleason: Gleason score
pgg45: percent of Gleason scores 4 or 5
lpsa: outcome, log of PSA
train: TRUE if the observation belongs to the training set, FALSE if it belongs to the test set
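The format described above can be checked directly once the data are loaded. The following is a minimal sketch, assuming the bestglm package (which supplies zprostate in the example further down) is installed; the name inputs is introduced here only for illustration.

```r
# Assumption: the bestglm package is installed; zprostate is loaded from it.
data(zprostate, package = "bestglm")

# Column names and types of the data frame
str(zprostate)

# Means and standard deviations of the eight input columns;
# standardized inputs should have mean near 0 and sd near 1
inputs <- zprostate[, 1:8]
round(colMeans(inputs), 3)
round(apply(inputs, 2, sd), 3)

# Number of observations in the training (TRUE) and test (FALSE) subsets
table(zprostate$train)
```

Selecting the inputs by position relies on the column order listed above; selecting by name would be more robust if the order ever changed.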
A study of 97 men with prostate cancer examined the correlation between PSA (prostate specific antigen) and a number of clinical measurements: lcavol, lweight, age, lbph, svi, lcp, gleason and pgg45.
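As a rough illustration of the correlations mentioned above, the sketch below computes the sample correlation of each input with lpsa over all 97 observations. It assumes the bestglm package is installed; the names predictors and r are hypothetical, and this is not a reproduction of any table in the textbook.

```r
# Assumption: the bestglm package is installed; zprostate is loaded from it.
data(zprostate, package = "bestglm")

# The eight clinical measurements used as inputs
predictors <- c("lcavol", "lweight", "age", "lbph",
                "svi", "lcp", "gleason", "pgg45")

# Pearson correlation of each input with the outcome (an 8 x 1 matrix)
r <- cor(zprostate[, predictors], zprostate$lpsa)

# Show the correlations from largest to smallest
round(sort(r[, 1], decreasing = TRUE), 2)
```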
Hastie, Tibshirani & Friedman. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd Ed. Springer.
#Prostate data. Table 3.3 HTF.
library(bestglm)
data(zprostate)
#full dataset: least squares fit on all eight inputs
trainQ<-zprostate[,10]   #logical training-set indicator
train <-zprostate[trainQ,-10]
test <-zprostate[!trainQ,-10]
ans<-lm(lpsa~., data=train)
sig<-summary(ans)$sigma   #residual standard error
yHat<-predict(ans, newdata=test)
yTest<-zprostate$lpsa[!trainQ]
TE<-mean((yTest-yHat)^2)   #mean squared prediction error on the test set
#subset: best subset regression selected with the BICq criterion
ansSub<-bestglm(train, IC="BICq")$BestModel
sigSub<-summary(ansSub)$sigma
yHatSub<-predict(ansSub, newdata=test)
TESub<-mean((yTest-yHatSub)^2)
#compare test error and residual sd of the two fits
m<-matrix(c(TE,sig,TESub,sigSub), ncol=2)
dimnames(m)<-list(c("TestErr","Sd"),c("LS","Best"))
m