An artificial data set which causes stepwise regression procedures to select a non-parsimonious model. The true model is a simple linear regression of y against x8.
A data frame with 50 observations on 10 variables.
Inspired by the pathoeg data set in the MPV package.
data(artificialeg)
full.mod = lm(y~.,data=artificialeg)
step(full.mod)
# generating model
n=50
set.seed(8) # a seed of 2 also works
x1 = rnorm(n,0.22,2)
x7 = 0.5*x1 + rnorm(n,0,sd=2)
x6 = -0.75*x1 + rnorm(n,0,3)
x3 = -0.5-0.5*x6 + rnorm(n,0,2)
x9 = rnorm(n,0.6,3.5)
x4 = 0.5*x9 + rnorm(n,0,sd=3)
x2 = -0.5 + 0.5*x9 + rnorm(n,0,sd=2)
x5 = -0.5*x2+0.5*x3+0.5*x6-0.5*x9+rnorm(n,0,1.5)
x8 = x1 + x2 -2*x3 - 0.3*x4 + x5 - 1.6*x6 - 1*x7 + x9 +rnorm(n,0,0.5)
y = 0.6*x8 + rnorm(n,0,2)
artificialeg = round(data.frame(x1,x2,x3,x4,x5,x6,x7,x8,x9,y),1)
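To see the contrast between the stepwise choice and the true model directly, the two fits can be placed side by side. The following is a minimal sketch, not part of the original example: it regenerates the data from the documented generating model so that it runs without the package installed, and the names `sim.dat`, `true.mod` and `step.mod` are illustrative assumptions. It relies only on `lm`, `step` and `AIC` from base R's stats package.

```r
# Regenerate the data from the documented generating model (seed 8).
n <- 50
set.seed(8)
x1 <- rnorm(n, 0.22, 2)
x7 <- 0.5 * x1 + rnorm(n, 0, sd = 2)
x6 <- -0.75 * x1 + rnorm(n, 0, 3)
x3 <- -0.5 - 0.5 * x6 + rnorm(n, 0, 2)
x9 <- rnorm(n, 0.6, 3.5)
x4 <- 0.5 * x9 + rnorm(n, 0, sd = 3)
x2 <- -0.5 + 0.5 * x9 + rnorm(n, 0, sd = 2)
x5 <- -0.5 * x2 + 0.5 * x3 + 0.5 * x6 - 0.5 * x9 + rnorm(n, 0, 1.5)
x8 <- x1 + x2 - 2 * x3 - 0.3 * x4 + x5 - 1.6 * x6 - 1 * x7 + x9 + rnorm(n, 0, 0.5)
y  <- 0.6 * x8 + rnorm(n, 0, 2)
sim.dat <- round(data.frame(x1, x2, x3, x4, x5, x6, x7, x8, x9, y), 1)

# Full model, the true simple regression, and the stepwise selection.
full.mod <- lm(y ~ ., data = sim.dat)
true.mod <- lm(y ~ x8, data = sim.dat)
step.mod <- step(full.mod, trace = 0)  # backward elimination by AIC, printing suppressed

# Compare the terms kept by stepwise selection with the single true predictor.
formula(step.mod)
AIC(true.mod, step.mod)
```

Printing `formula(step.mod)` shows which predictors the stepwise procedure retained, and the `AIC` table lets the retained model be compared with the regression on x8 alone.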