Same as step() in R, but able to check marginal effects.

Description

Works like step() in R, but can also check the marginal effects of the candidate models at each step.

Usage

stepwise2(model, scope, trace = 1, steps = 1000, k = 2, data,
  family = NULL, IC_method = c("AIC", "BIC"), test_suit = NULL,
  STOP = FALSE)

Arguments

model

an output of lm or glm

scope, trace, steps, k

see step()

data

a data.frame used in regression.

family

passed as the family argument of glm. The default is NULL, which means the family embedded in the model is used.

IC_method

either 'AIC' or 'BIC'; overrides the k argument above.

test_suit

specifies the correct marginal effects you want to check. It is a named list: each name is a raw variable, and each value is a list of arguments for the function deleting_wrongeffect. If NULL (default), no marginal effect is checked. See the example code for details.

STOP

whether to stop and wait for your response at each step.
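
As background for the k and IC_method arguments: in base R's step(), k = 2 gives the AIC penalty and k = log(n) gives the BIC penalty. The sketch below uses only base R and the built-in mtcars data (an assumption made for illustration; stepwise2 itself is not called), and it presumes that IC_method selects between these two penalties in the same way.

```r
# Base R illustration only; stepwise2() is not called here.
fit_full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
n <- nrow(mtcars)

# k = 2 is the AIC penalty; k = log(n) is the BIC penalty.
sel_aic <- step(fit_full, direction = "backward", k = 2, trace = 0)
sel_bic <- step(fit_full, direction = "backward", k = log(n), trace = 0)

# Compare the formulas selected under each criterion.
formula(sel_aic)
formula(sel_bic)
```

Because log(n) exceeds 2 once n is 8 or more, the BIC penalty is heavier and tends to keep fewer terms.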

Details

At each step of the regression, the candidate models with the correct marginal effects are selected first, and then the one with the lowest (best) AIC/BIC among them is chosen, following the same criterion as step().
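
The filter-then-select idea can be sketched in base R as follows. This is not the package's implementation: the candidate formulas, the mtcars data, and the simplified sign check on a single coefficient are all assumptions made for illustration, whereas the package checks marginal effects through deleting_wrongeffect.

```r
# Hypothetical sketch of one selection step; not the package's code.
candidates <- list(
  m1 = mpg ~ wt,
  m2 = mpg ~ wt + hp,
  m3 = mpg ~ wt + hp + qsec
)
fits <- lapply(candidates, function(f) lm(f, data = mtcars))

# Simplified check (assumption): the effect of 'wt' counts as correct
# when its coefficient is negative.
correct_effect_ind <- vapply(
  fits, function(f) unname(coef(f)["wt"]) < 0, logical(1)
)
aic <- vapply(fits, AIC, numeric(1))

# Per-candidate diagnostics, similar in spirit to the printed data.frame.
diagnostics <- data.frame(
  model = names(fits), AIC = aic, correct_effect_ind = correct_effect_ind
)
print(diagnostics)

# Keep only candidates with the correct effect, then take the lowest AIC.
eligible <- names(fits)[correct_effect_ind]
best_name <- eligible[which.min(aic[eligible])]
best_fit <- fits[[best_name]]
```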

Value

a stepwise-selected model. If test_suit is specified, the returned model is the one with the lowest (best) AIC/BIC among those that have the correct marginal effects.

As a side effect, a data.frame containing diagnostic information for each step is printed. The 'correct_effect_ind' column is a boolean vector that shows whether each model has the correct marginal effect.

Examples

# starting model:
# can have a dirty formula like below

set.seed(413)
traing_data = ggplot2::diamonds[runif(nrow(ggplot2::diamonds))<0.05,]
nrow(traing_data)

diamond_lm3 = lm(formula = price ~ cut + carat - cut   , data = traing_data)

scope = list(lower = price ~ 1,
             upper = price ~  I(carat^2) + I(carat^3) + I(carat * depth) + depth + carat)

# traditional stepwise regression with no marginal effect check
model1 = stepwise2(model = diamond_lm3, scope = scope,k = 2,
                   trace = TRUE, data = traing_data, STOP = TRUE)
model1
# the result is exactly the same as with the default step() function.
model2 = suppressWarnings(step(diamond_lm3,scope = scope, k = 2))
model2


#__ How to Specify the Correct Marginal Effects in Stepwise Regression  __

# this test_suit means we will check the marginal effects of both 'carat' and 'depth'
# for 'carat', we will focus on only 4 coeff vars:
    # "I(carat^3)","I(carat*depth)","I(carat^2)","carat"
# for 'depth', as we do not specify any particular coeff vars,
# we will check all coeff vars related to 'depth'

test_suit = list(
  carat = list(
    # the list name must be the raw var
    focus_var_raw = "carat",
    # must specify the focus_var_raw (see deleting_wrongeffect() ) as the raw var
    focus_var_coeff = c("I(carat^3)","I(carat*depth)",
                        "I(carat^2)","carat") ,
    # optional. If not defined, all coeffs related to the raw var are checked
    focus_value =list(carat = seq(0.5,6,0.1)),
    Monoton_to_Match = 1 # optional. Default is 1
  ),
  depth = list(
    focus_var_raw = "depth",
    Monoton_to_Match = 1
  )
)

model3 =  stepwise2(model = diamond_lm3, scope = scope, trace = TRUE,
                    data = traing_data,
                    STOP = FALSE, test_suit = test_suit)

# see the difference from model1
effect(model3,focus_var_raw =  "carat", focus_value =list(carat = seq(0.5,6,0.1)))
