knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(stat302package)

Introduction

The stat302package package is a final project for STAT 302. It provides functions for one-sample t-tests, linear modeling, k-nearest-neighbors cross-validation, and random forest cross-validation. To install the package from GitHub, run the following code.

Install Package from GitHub

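# Option 1: install from a local copy of the source.
# Move to the directory that contains the stat302package folder first.
setwd("..")
devtools::install("stat302package")
library(stat302package)
?my_pow

# Option 2: install directly from GitHub.
devtools::install_github("edwardsung63/stat302package")
library(stat302package)
?my_pow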

library(stat302package)

Tutorial for my_t_test()

All the tests demonstrated below use lifeExp data from my_gapminder.

Case I: alternative = "two.sided"

\begin{align} H_0: \mu &= 60,\\ H_a: \mu &\neq 60. \end{align}

data("my_gapminder")
my_t_test(my_gapminder$lifeExp, alternative = "two.sided", mu = 60)

The test returns a p-value greater than 0.05. The result is therefore not statistically significant at the 0.05 level, and we do not have enough evidence to reject the null hypothesis.

Case II: alternative = "less"

\begin{align} H_0: \mu &= 60,\\ H_a: \mu &< 60. \end{align}

my_t_test(my_gapminder$lifeExp, alternative = "less", mu = 60)

The test returns a p-value less than 0.05. The result is therefore statistically significant at the 0.05 level, and we reject the null hypothesis in favor of the alternative that the mean of lifeExp is less than 60.

Case III: alternative = "greater"

\begin{align} H_0: \mu &= 60,\\ H_a: \mu &> 60. \end{align}

my_t_test(my_gapminder$lifeExp, alternative = "greater", mu = 60)

The test returns a p-value greater than 0.05. The result is therefore not statistically significant at the 0.05 level, and we do not have enough evidence to reject the null hypothesis.
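The three cases differ only in which tail of the t distribution the p-value is taken from. The sketch below illustrates that calculation with base R; it is not the source of my_t_test(). The helper name t_test_sketch, its return names, and the simulated data (a stand-in for my_gapminder$lifeExp) are all assumptions made for illustration. The result is compared against stats::t.test() as a sanity check.

```r
# Hypothetical helper illustrating a one-sample t-test; not the my_t_test() source.
t_test_sketch <- function(x, alternative = c("two.sided", "less", "greater"), mu = 0) {
  alternative <- match.arg(alternative)
  n <- length(x)
  df <- n - 1
  se <- sd(x) / sqrt(n)                 # standard error of the sample mean
  t_stat <- (mean(x) - mu) / se
  # Pick the tail(s) of the t distribution that match the alternative
  p_val <- switch(alternative,
    two.sided = 2 * pt(abs(t_stat), df, lower.tail = FALSE),
    less      = pt(t_stat, df, lower.tail = TRUE),
    greater   = pt(t_stat, df, lower.tail = FALSE)
  )
  list(test_stat = t_stat, df = df, alternative = alternative, p_val = p_val)
}

set.seed(302)
x <- rnorm(100, mean = 59.5, sd = 13)   # simulated stand-in data, not gapminder

res <- t_test_sketch(x, alternative = "less", mu = 60)
ref <- t.test(x, alternative = "less", mu = 60)
all.equal(res$p_val, ref$p.value)       # the two p-values should agree
```

Note that the "less" and "greater" p-values for the same data and mu always sum to 1, which is why a small p-value in one direction implies a large one in the other.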

Tutorial for my_lm()

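# Fit the model; per the check on the last line, my_lm() returns a table
test <- my_lm(my_fml = lifeExp ~ gdpPercap + continent, my_data = my_gapminder)
# Assumption: the first column of the table holds the coefficient estimates
my_coef <- test[, 1]
# Build the full design matrix (intercept, gdpPercap, continent dummies)
# so that its columns line up with the fitted coefficients
my_matrix <- model.matrix(lifeExp ~ gdpPercap + continent, data = my_gapminder)
y_hat <- my_matrix %*% as.matrix(my_coef)
# Actual versus fitted values
plot(my_gapminder$lifeExp, y_hat, xlab = "Actual lifeExp", ylab = "Fitted lifeExp")
testthat::expect_is(my_lm(pop ~ gdpPercap, my_data = my_gapminder), "table")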

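The gdpPercap coefficient is the expected difference in lifeExp between two observations in the same continent that differ by one unit in gdpPercap. Compared with the coefficients for the different continents, the gdpPercap coefficient is small, so a one-unit change in gdpPercap has less influence on lifeExp than a change of continent.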
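To make the link between coefficients, design matrix, and fitted values concrete, here is a minimal sketch that solves the least-squares normal equations directly and compares the result with stats::lm(). It is not the my_lm() source. It uses the built-in mtcars data as a stand-in, with wt playing the role of the numeric predictor and cyl the role of the categorical one.

```r
# Sketch with the built-in mtcars data; variables are stand-ins for the gapminder model.
dat <- transform(mtcars, cyl = factor(cyl))
fml <- mpg ~ wt + cyl

X <- model.matrix(fml, data = dat)   # intercept, wt, and dummy columns for cyl
y <- dat$mpg

# Least-squares estimates from the normal equations: (X'X) beta = X'y
beta_hat <- solve(crossprod(X), crossprod(X, y))

# Fitted values are the design matrix times the coefficient vector
y_hat <- X %*% beta_hat

# Cross-check against stats::lm()
ref <- lm(fml, data = dat)
all.equal(as.vector(beta_hat), unname(coef(ref)))
```

Because a factor expands into several dummy columns, the design matrix has one column per coefficient; a matrix built by hand with cbind(1, x) only matches a model with a single numeric predictor.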

Tutorial for my_knn_cv()

tutor_knn <- my_knn_cv(train = my_gapminder[, 3 : 4], 
                       cl = my_gapminder$continent, k_nn = 10, k_cv = 5)
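As a rough illustration of what a k-fold cross-validated k-nearest-neighbors error involves, the sketch below splits the data into folds, predicts each held-out fold from the remaining ones with class::knn(), and averages the misclassification rates. It is not the my_knn_cv() source: the helper name knn_cv_sketch and its return structure are hypothetical, and the built-in iris data stand in for my_gapminder.

```r
library(class)  # provides knn(); a recommended package bundled with most R installations

# Hypothetical helper illustrating k-fold CV for k-nearest neighbors;
# not the my_knn_cv() source.
knn_cv_sketch <- function(train, cl, k_nn, k_cv) {
  n <- nrow(train)
  fold <- sample(rep(1:k_cv, length.out = n))   # random, near-equal fold assignment
  miss <- numeric(k_cv)
  for (i in 1:k_cv) {
    held_out <- fold == i
    pred <- knn(train = train[!held_out, , drop = FALSE],
                test  = train[held_out, , drop = FALSE],
                cl    = cl[!held_out],
                k     = k_nn)
    miss[i] <- mean(pred != cl[held_out])       # misclassification rate on this fold
  }
  list(fold_error = miss, cv_error = mean(miss))
}

set.seed(302)
out <- knn_cv_sketch(iris[, 1:4], iris$Species, k_nn = 10, k_cv = 5)
out$cv_error
```

The arguments mirror the call above (train, cl, k_nn, k_cv) so the two can be read side by side.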

Tutorial for my_rf_cv()

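k_vec <- c(4, 10, 20)
# One column per value of k, one row per simulation
tutor_rf <- matrix(NA, nrow = 30, ncol = length(k_vec))
for (i in seq_along(k_vec)) {
  for (k in 1 : 30) {
    # Cross-validated MSE for the k-th simulation with k_vec[i] folds
    tutor_rf[k, i] <- my_rf_cv(k = k_vec[i])
  }
}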

Boxplots

library(ggplot2)
# Reshape to long format: one row per (k, simulation) pair.
# as.vector() stacks the columns, so the k labels repeat once per simulation.
df <- data.frame("k_value" = rep(k_vec, each = nrow(tutor_rf)),
                 "MSE" = as.vector(tutor_rf[, seq_along(k_vec)]))
boxplot_df <- ggplot(data = df,
       aes(x = k_value, y = MSE, group = k_value)) +
  geom_boxplot(fill = "red") +
  theme_bw(base_size = 20) +
  labs(title = "Estimated MSE for each k",
       x = "k values",
       y = "Estimated MSE") +
  theme(plot.title = element_text(hjust = 0.5, face = "bold")) +
  scale_x_continuous(breaks = k_vec)
boxplot_df


edwardsung63/stat302package documentation built on March 16, 2020, 6:12 a.m.