The code for these assignments is stored in the 'meerkat' package. It contains helper functions that are not exported and for which no visible documentation was written. At the start of each part, I give a link to the R code hosted on GitHub where you can view it.
View the code belonging to this assignment here
1. Assuming that the data are approximately normal, simulate one sample of data and perform a t-test using the R in-built t-test function. What is the p-value of the test? What is your conclusion?
rm(list=ls())
library(meerkat)

# Set values and draw a sample
n <- 50
mu <- 160
mu0 <- 150
sd <- 15
set.seed(600)
x <- rnorm(n, mu, sd)
set.seed(500)
y <- rnorm(n, mu0, sd)

# We perform a t-test on the data
t.test(x, y, alternative = "greater", var.equal = TRUE)
The one-tailed test is statistically significant (p < 0.001). At a significance level of $\alpha = 0.05$, we would therefore reject the null hypothesis that there is no difference in the test scores of these groups. The experimental group, it would seem, performed better than the control group, and the special training has had the desired effect.
2. Assuming that the data are normally distributed, write a function that generates data and performs a t-test 1000 times, storing the values of the t-statistic (or the p-value) in a vector. Thinking about the definition of power in terms of rejecting the null hypothesis, how would you obtain an estimate of the power in this situation? Compare your results with those given by power.t.test.
Power is defined as the probability $\pi(\theta)$ of rejecting the null hypothesis given that the alternative hypothesis is true.^[Rizzo, Maria L. "Statistical Computing with R". Chapman and Hall/CRC, 2007] The probability of making a Type II error (wrongfully concluding that there is no effect while an effect exists) is $1-\pi(\theta)$. A common convention is that the Type II error rate should not exceed 20\%. Hence, the minimum power we should strive for is $0.8$, meaning that we detect a true effect $4$ out of $5$ times.
The function emp_power()
calculates, for a given number $R$ of repetitions, the t-statistic between the mean value of the control group and that of a random sample drawn from the normal distribution. Given that we are interested in the probability $\pi(\theta)$, a reasonable approach is to take the proportion of cases in which the null hypothesis is rejected. If the true difference of means is $0$, we would still expect to reject in about 5\% of cases because of the Type I error rate. As we increase the difference in mean values, we expect the power (the percentage of significant t-tests) to increase.
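The actual emp_power() helper in the 'meerkat' package is not exported, so the sketch below is illustrative: a minimal power-estimation function built on the rejection-proportion idea just described, with argument names chosen here rather than taken from the package.

```r
# Illustrative sketch of an emp_power()-style function: simulate R two-sample
# t-tests and estimate power as the proportion of rejections at level alpha.
emp_power_sketch <- function(n, mu, sd, mu0, R = 1000, alpha = 0.05) {
  pvals <- replicate(R, {
    x <- rnorm(n, mu, sd)   # experimental group
    y <- rnorm(n, mu0, sd)  # control group
    t.test(x, y, alternative = "greater", var.equal = TRUE)$p.value
  })
  reject <- pvals < alpha              # TRUE where H0 is rejected
  power <- mean(reject)                # proportion of rejections
  se <- sqrt(power * (1 - power) / R)  # Monte Carlo standard error
  list(power = power, se = se)
}

set.seed(600)
emp_power_sketch(50, 160, 15, mu0 = 150)
```

The Monte Carlo standard error here is the binomial standard error of the rejection proportion, which is what the error bars in the plots below are based on.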
Furthermore, given that we draw data from a normal distribution and for values where $\theta_0 < \theta_1$, we may be confident that the power of our test increases as we:

- increase the difference between the means, $\theta_1 - \theta_0$;
- increase the sample size $n$;
- increase the significance level $\alpha$;
- decrease the standard deviation $\sigma$.
The effect of changing these values is shown in the plots below.
library(ggplot2)
library(grid)
library(gridExtra)

# Define a grid of possible values for each parameter
grids <- list(
  grid_mu = seq(140, 160, 1),
  grid_sd = seq(5, 15, 0.5),
  grid_alpha = seq(0.01, 0.11, 0.005),
  grid_n = seq(50, 150, 5)
)
grids_out <- vector("list", length = 4)

# For each value, calculate the power of the t-test while holding the
# other parameters constant
for (i in seq_along(grids)) {
  # Get name
  grids_in_name <- names(grids)[i]
  grid <- grids[[i]]
  # This switch is a wrapper for a bunch of if/else statements
  power_values <- switch(grids_in_name,
    grid_mu = lapply(grid, function(x)
      emp_power(50, x, 15, R = 1000, type = "two_sample", alpha = 0.05,
                alternative = "greater", mu0 = 150)),
    grid_n = lapply(grid, function(x)
      emp_power(x, 155, 15, R = 1000, type = "two_sample", alpha = 0.05,
                alternative = "greater", mu0 = 150)),
    grid_sd = lapply(grid, function(x)
      emp_power(50, 155, x, R = 1000, type = "two_sample", alpha = 0.05,
                alternative = "greater", mu0 = 150)),
    grid_alpha = lapply(grid, function(x)
      emp_power(50, 155, 15, R = 1000, type = "two_sample", alpha = x,
                alternative = "greater", mu0 = 150))
  )
  # Get power values
  pv <- sapply(power_values, function(x) x$power)
  # Put in a data frame
  df <- data.frame(
    "grid" = grid,
    "power" = pv,
    "se" = sapply(power_values, function(x) x$se)
  )
  # Plot
  grids_out[[i]] <- ggplot(df, aes(x = grid, y = power)) +
    geom_line() +
    geom_point() +
    geom_errorbar(aes(ymin = power - se, ymax = power + se)) +
    theme_bw() +
    scale_x_continuous(name = strsplit(grids_in_name, "_")[[1]][2]) +
    scale_y_continuous(name = "Power")
}

# Arrange plots in a 2x2 grid
grid.arrange(grids_out[[1]], grids_out[[2]], grids_out[[3]], grids_out[[4]],
             ncol = 2, nrow = 2)
If we compare the results from the empirical test to the R function, we observe:
power.t.test(n = 50, delta = 10, sd = 15, type = "two.sample",
             sig.level = 0.05, alternative = "one.sided")
and
set.seed(700)
emp <- emp_power(n = 50, mu = 160, mu0 = 150, sd = 15, alpha = 0.05,
                 type = "two_sample", alternative = "greater", R = 1000)
emp$power
This gives us a comparable power estimate.
View the code belonging to this assignment here
We begin by specifying the data.
# Create variables for flight simulation data
CSFI <- c(2, 5, 5, 6, 6, 7, 8, 9)
TFI <- c(1, 1, 2, 3, 3, 4, 5, 7, 7, 8)
If we assume that the variances are equal, we can apply a two-sample t-test using the permutation method.^[Rizzo, p.218]
# Run the permutation test
set.seed(1200)
pt <- permutation_test(CSFI, TFI, 999)
pt
plot(pt)
t.test(CSFI, TFI, var.equal = TRUE)
The results lead to the same conclusion, although the p-value obtained using the permutation test ($0.132$) is roughly $15$\% higher than that of the t-test ($0.1123$).
We can use the R function runif()
to draw values from the uniform distribution.
set.seed(100)
# This draws 5 values from a uniform with a=0 and b=1
runif(5)
If we want $80$\% of our rows randomly sampled, then we keep the rows whose generated uniform values lie below $0.8$. This works because, if a random variable $X$ is uniform on $(a, b)$, then for any $p \in [0,1]$ we have $P\left[X < a + p(b - a)\right] = p$.
# Plot the density, the distribution function, and random draws on [0, 1]
library(ggplot2)
library(gridExtra)
x <- seq(0, 1, 0.1)
df <- data.frame(
  x = x,
  dens = dunif(x, 0, 1),
  dist = punif(x, 0, 1),
  rand = runif(length(x), 0, 1)
)
p1 <- ggplot(df, aes(x = x, y = dens)) + geom_line() + ggtitle("uniform density")
p2 <- ggplot(df, aes(x = x, y = dist)) + geom_line() + ggtitle("uniform distribution")
p3 <- ggplot(df, aes(x = x, y = rand)) +
  geom_point() +
  ggtitle("random uniform distribution values") +
  geom_hline(yintercept = 0.25, col = "red", linetype = 3) +
  geom_hline(yintercept = 0.5, col = "red", linetype = 3) +
  geom_hline(yintercept = 0.75, col = "red", linetype = 3)
grid.arrange(p1, p2, p3, ncol = 3)
In the third plot above, we see that approximately 50\% of the data lie below the median and 50\% lie above it. Hence, if we sample 80\% of our data this way, we expect that proportion to be correct on average due to the randomness of the draws.
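The subsetting rule described above can be made concrete with a small example; the data frame and names here are hypothetical, not from the package.

```r
# Keep roughly 80% of rows by keeping those whose uniform draw is below 0.8
set.seed(42)
df <- data.frame(id = 1:1000, value = rnorm(1000))
keep <- runif(nrow(df)) < 0.8  # each row kept with probability 0.8
subset_df <- df[keep, ]
nrow(subset_df) / nrow(df)     # close to 0.8 on average
```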
In the permutation_test()
function, we can specify whether we want to sample this way using the use_sample()
flag. As shown above, there is always a chance that we will draw a sample with too few observations. To guard against this, we impose the rule that the realised proportion of observations may not lie more than one standard error away from the target proportion. If it does, the function calls itself recursively until the proportion is within acceptable bounds. The bounds may be widened or narrowed using the tolerance
parameter; the smaller this parameter is, the more we compress the 'acceptable' region around the proportion.
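The actual recursive helper is not exported from the package, so the following is only a sketch of the accept-or-retry rule just described (written with a loop rather than recursion, and with illustrative names); the realised proportion must fall within `tolerance` standard errors of the target.

```r
# Sketch: redraw the runif()-based subset until the realised proportion of
# kept rows is within `tolerance` standard errors of the target proportion.
sample_within_tolerance <- function(n, prop = 0.8, tolerance = 1) {
  se <- sqrt(prop * (1 - prop) / n)  # standard error of the proportion
  repeat {
    keep <- runif(n) < prop
    if (abs(mean(keep) - prop) <= tolerance * se) {
      return(which(keep))            # indices of the accepted subset
    }
  }
}

set.seed(888)
idx <- sample_within_tolerance(100, prop = 0.8, tolerance = 0.5)
length(idx)  # close to 80 by construction
```

A smaller tolerance shrinks the acceptance region, so more candidate draws are discarded before one is accepted, which is why the number of random draws consumed (and hence the interaction with set.seed()) varies.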
set.seed(888)
pt <- permutation_test(CSFI, TFI, 999, use_sample = FALSE, tolerance = 1)
pt
set.seed(888)
pt <- permutation_test(CSFI, TFI, 999, use_sample = FALSE, tolerance = 0.5)
pt
set.seed(888)
pt <- permutation_test(CSFI, TFI, 999, use_sample = TRUE)
pt
We may observe different results under these different sampling procedures, because the group sizes can vary quite a bit. Moreover, decreasing the tolerance value means we resample many more times, because many samples are discarded. I am not entirely sure how this interacts with the set.seed()
function, but testing shows that the results can vary quite a bit. Clearly, this is quite a roundabout way of avoiding the sample()
function.
Another approach to sampling without using the sample()
function is to draw values from the binomial distribution.
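This binomial alternative can be sketched as follows (again with hypothetical data, not from the package): rbinom() with size = 1 produces Bernoulli draws, so each row is kept independently with probability 0.8.

```r
# Keep each row independently with probability 0.8 via Bernoulli draws
set.seed(2024)
df <- data.frame(id = 1:1000, value = rnorm(1000))
keep <- rbinom(nrow(df), size = 1, prob = 0.8) == 1
subset_df <- df[keep, ]
```

Since a Bernoulli(0.8) draw equals 1 exactly when a Uniform(0, 1) draw falls below 0.8, this is equivalent in distribution to the runif() approach above.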
View the code belonging to this assignment here
I would use bootstrapping to do this. With bootstrapping, we draw $R$ samples with replacement from the original sample to construct an empirical sampling distribution. In this case, the quantity of interest $\theta$ is the t-statistic.
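The boot_ttest() helper used below is not exported from the package, so the following is an illustrative sketch of the underlying idea: resample each group with replacement, recompute the t-statistic each time, and summarise the resulting bootstrap distribution.

```r
# Data from the flight simulation example
CSFI <- c(2, 5, 5, 6, 6, 7, 8, 9)
TFI  <- c(1, 1, 2, 3, 3, 4, 5, 7, 7, 8)

set.seed(4500)
R <- 2000
t_star <- replicate(R, {
  x <- sample(CSFI, replace = TRUE)  # resample each group with replacement
  y <- sample(TFI, replace = TRUE)
  t.test(x, y, var.equal = TRUE)$statistic
})

# Bootstrap standard error and a simple percentile interval for t
sd(t_star)
quantile(t_star, c(0.025, 0.975))
```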
set.seed(4500)
res <- boot_ttest(CSFI, TFI, R = 2000)
res
The boot_ttest()
function returns a variety of statistics:
t.test(CSFI, TFI, var.equal = TRUE)
plot(res)
In summary, the above diagnostics show that there is a lot of uncertainty in our point estimate. This is not surprising, given that our sample size is very small.