knitr::opts_chunk$set(echo = TRUE, include = TRUE)

Reducing Number of Predictors In Predictive Analytics {.build}

Thought Experiment | Find some lucky pennies

Imagine that you have 50 pennies labeled 1 through 50 and you flip each one 10 times and write on each coin how many times it landed head side up.

Set It Up {.build}

set.seed(123)
n_coins <- 50
tosses <- 10
p_heads <- 0.5
results <- rbinom(n_coins, tosses, p_heads)
I should have said, that we are looking for some lucky pennies, because I going to use the lucky pennies we find in a magic trick later. To find a lucky penny, I need to decide how many heads I need to observe to be surprised. I am going to cheat, because I happen to remember just enough theoretical statistics to know that when you observe a group of boolean trials, you can use the binomial distribution to calculated the probability of observing any number of one type of observation (heads) out of a known number of trials. (We are not going to show the equation, but it's a combinatorial expression with factorials so it is pretty.) For the sake of brevity, I am going to tell you it takes a lot to surprise me.

What Is Surprising {.build}

dbinom(7, tosses, p_heads)
dbinom(8, tosses, p_heads)
dbinom(9, tosses, p_heads)
dbinom(10, tosses, p_heads)

This Is Surprising Enough (I think ??)

surprise_threshhold <- 8
I am going to be a traditionalist here and decide I will be surprised if there are `r surprise_threshhold` or more heads out of `r tosses` tosses.

Being the Geeks We Are {.build}

picks <- seq_along(results)[results >= surprise_threshhold]
picks
results[results >= surprise_threshhold]

What Did We Just Accomplish? {.build}

I can confidently carry these pennies into my magic show knowing they will continue to perform as well there as they did here.

But We are Statisticians

tosses_large <- 1000
pbinom(800, tosses_large, 0.5, lower.tail = FALSE)
pbinom(550, tosses_large, 0.5, lower.tail = FALSE)
pbinom(525, tosses_large, 0.5, lower.tail = FALSE)
pbinom(526, tosses_large, 0.5, lower.tail = FALSE)
Now as statisticians, `r tosses` is not a big enough sample for a good estimate so let's make it bigger.

We're Still Surprised

surprise_threshhold_large <- 526
results_large <- rbinom(n_coins, tosses_large, p_heads)
picks_large <- seq_along(results_large)[results_large >= surprise_threshhold_large]
length(picks_large)
picks_large
results_large[results_large >= surprise_threshhold_large]

What Have We Seen? {.build}

It is not the initial sample, the results of flipping the 50 coins. It is the belief that the coins that behaved in a surprising way will continue to behave in a surprising way.

Stepwise Methods {.build}

Stepwise Methods {.build}

Stepwise Methods {.build}

Stepwise Methods {.build}

Statistical Issues {.build}

Effects Demonstrated in 1970's and 1980's {.build}

Effects Demonstrated in 1992 (1 of 6) {.build}

Effects Demonstrated in 1992 (2 of 6)

Predictors | Noise | %Noise -----------|--------|------- 12 | 0.43 | 20 18 | 0.96 | 40 24 | 1.44 | 46

Effects Demonstrated in 1992 (3 of 6)

Predictors | Noise | %Noise -----------|--------|------- 12 | 0.47 | 35 18 | 0.93 | 59 24 | 1.36 | 62

Effects Demonstrated in 1992 (4 of 6)

Predictors | Actual | Noise | %Noise -----------|--------|--------|------- 12 | 1.70 | 0.43 | 20 18 | 1.64 | 0.96 | 40 24 | 1.66 | 1.44 | 46

Effects Demonstrated in 1992 (5 of 6)

Predictors | Actual | Noise | %Noise -----------|--------|--------|------- 12 | 0.86 | 0.47 | 35 18 | 0.87 | 0.93 | 59 24 | 0.83 | 1.36 | 62

Effects Demonstrated in 1992 (6 of 6) {.build}

What If You Have Great Data

knitr::include_graphics("images/actual_vs_Y.png")

And Noisy Noise

knitr::include_graphics("images/noise_vs_Y.png")

You Pick Noise

knitr::include_graphics("images/selected_noise_vs_Y.png")

What If You Change Thresholds

$N = 900$

Predictors | Noise $\alpha = 0.0016$ |Noise $\alpha = 0.15$ -----------|-------------------------|---------------------- 12 | 2 | 2 18 | 3 | 3 24 | 4 | 4 50 | 8 | 10 100 | 16 | 17

Options {.build}

Lasso

$min||\mathbf{y - X\beta}||^2$ subject to $\sum\limits_{j=1}^{m} |\beta_{j}| \leq t$

Elastic Net

$min||\mathbf{y - X\beta}||^2$ subject to $\sum\limits_{j=1}^{m} |\beta_{j}| \leq t_1$, $\sum\limits_{j=1}^{m} \beta_{j}^{2} \leq t_2$



rmsharp/stepwiser documentation built on May 26, 2019, 9:33 a.m.