In fhernanb/stests: Package with useful statistical tests

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "##"
)

1) Introduction

In this vignette we explain how to use the functions available in the stests package to hypothesis testing. Additionally, we compare our functions with the built-in functions t.test and var.test from stats package.

The functions covered in this vignette are:

z_test: function to perform z-test using values.
z.test: function to perform z-test using vectors.
t_test: function to perform t-test using values.
t.test: built-in R function from stats package to perform t-test using vectors.
var_test: function to perform $\chi^2$-test or F-test using values.
var.test: fuction that generalizes the var.test function from stats package. This function uses vectors.
print.htest: generic function to print objects with htest class.
plot.htest: generic function to plot objects with htest class.

2) `stests` package

Any user can download the stests package from GitHub using the next code:

if (!require('devtools')) install.packages('devtools')
devtools::install_github('fhernanb/stests', force=TRUE)

Next, the package must be loaded into the current session using:

library(stests)

2.1) z-test for population mean $\mu$

The objective in this type of problems is to test $H_0: \mu = \mu_0$ against:

$H_a: \mu > \mu_0$,
$H_a: \mu < \mu_0$,
$H_a: \mu \neq \mu_0$,

using the information of a random sample $X_1, X_2, \ldots, X_n$ from a Normal population with known variance $\sigma^2$.

2.1.1) z-test using `z_test` function

The z_test function can be used for students or instructors to solve textbook's problems in which the information is summarized with values. In the next examples we show how to use this function and it's utility.

Example 10.3 (Walpole, 2012)

A random sample of 100 recorded deaths in the United States during the past year showed an average life span of 71.8 years. Assuming a population standard deviation of 8.9 years, does this seem to indicate that the mean life span today is greater than 70 years? Use a 0.05 level of significance.

In this example we have: $n=100$, $\bar{x}=71.8$, $\sigma^2=79.21$. The objective is to test the next hypothesis:

$H_0: \mu = 70$ vs $H_a: \mu > 70$

Where $\mu$ representes the True average life span today (years).

The code to perform the test with $\alpha=0.05$ significance is:

test1 <- z_test(meanx=71.8, nx=100, sigma2=79.21, mu=70, alternative='greater')
test1

We can observe that the output provides a p-value and a confidence interval for the population mean that will help the user in the decision making of the hypothesis test.

If we want to depict the p-value for the test, we could save the result in an object and then use the plot.htest function as follows:

plot(test1, shade.col='firebrick1', col='firebrick1')

2.1.2) z-test using `z.test` function

The z.test function can be used to solve problems in which we have the raw data as a vector. In the next examples we show how to use this function.

Example 8.8 (Devore, 2016)

A dynamic cone penetrometer (DCP) is used for measuring material resistance to penetration (mm/blow) as a cone is driven into pavement or subgrade. Suppose that for a particular application it is required that the true average DCP value for a certain type of pavement be less than 30. The pavement will not be used unless there is conclusive evidence that the specification has been met. Let's state and test the appropriate hypotheses using the following data:

14.1, 14.5, 15.5, 16.0, 16.0, 16.7, 16.9, 17.1, 17.5, 17.8, 17.8, 18.1, 18.2, 18.3, 18.3, 19.0, 19.2, 19.4, 20.0, 20.0, 20.8, 20.8, 21.0, 21.5, 23.5, 27.5, 27.5, 28.0, 28.3, 30.0, 30.0, 31.6, 31.7, 31.7, 32.5, 33.5, 33.9, 35.0, 35.0, 35.0, 36.7, 40.0, 40.0, 41.3, 41.7, 47.5, 50.0, 51.0, 51.8, 54.4, 55.0, 57.0

In this example we have: $n=52$. The objective is to test the next hypothesis:

$H_0: \mu = 30$ vs $H_a: \mu < 30$

where $\mu$ represents the True average DCP value for a certain type of pavement.

In this example we don't know if the random sample comes from a normal population, but as $n=52$ and $\sigma^2$ is unkown, we can use the z-test with $\bar{x}=28.76$, $s^2=150.422$.

The code to perform the test with $\alpha=0.05$ significance is:

x<-c(14.1, 14.5, 15.5, 16.0, 16.0, 16.7, 16.9, 17.1, 17.5, 17.8, 17.8, 
     18.1, 18.2, 18.3, 18.3, 19.0, 19.2, 19.4, 20.0, 20.0, 20.8, 20.8, 
     21.0,21.5, 23.5, 27.5, 27.5, 28.0, 28.3, 30.0, 30.0, 31.6, 31.7, 
     31.7, 32.5, 33.5, 33.9, 35.0, 35.0, 35.0, 36.7, 40.0, 40.0, 41.3, 
     41.7, 47.5, 50.0, 51.0, 51.8, 54.4, 55.0, 57.0)
test2 <- z.test(x=x, sigma2=var(x), mu=30, alternative='less')
test2
plot(test2, shade.col='firebrick1', col='firebrick1')

2.2) t-test for population mean $\mu$ and for the difference in means $\mu_1-\mu_2$

The objective in this type of problems is to test $H_0: \mu = \mu_0$ against:

$H_a: \mu > \mu_0$,
$H_a: \mu < \mu_0$,
$H_a: \mu \neq \mu_0$,

using the information of a random sample $X_1, X_2, \ldots, X_n$ from a Normal population with unknown variance $\sigma^2$.

Also, we can test $H_0: \mu_x-\mu_y = \delta_0$ against:

$H_a: \mu_x-\mu_y > \delta_0$,
$H_a: \mu_x-\mu_y < \delta_0$,
$H_a: \mu_x-\mu_y \neq \delta_0$,

using the information of two random samples $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$, both from populations with normal distributions and with unknown variances $\sigma_1^2$,$\sigma_2^2$.

2.2.1) t-test using `t_test` function

The t_test function can be used for students or instructors to solve textbook's problems in which the information is summarized with values. In the next examples we show how to use this function and it's utility.

Excercise 10.29 (Walpole, 2012): test on a single sample

Past experience indicates that the time required for high school seniors to complete a standardized test is a normal random variable with a mean of 35 minutes. If a random sample of 20 high school seniors took an average of 33.1 minutes to complete this test with a standard deviation of 4.3 minutes, test the hypothesis, at the 0.05 level of significance, that $\mu=35$ minutes against the alternative that $\mu<35$ minutes.

In this example we have: $n=20$, $\bar{x}=33.1$, $s^2=18.49$. The objective is to test the next hypothesis:

$H_0: \mu = 35$ vs $H_a: \mu < 35$

$\mu=$ True average time required for high school seniors to complete a standardized test (minutes)

The code to perform the test with $\alpha=0.05$ significance is:

test3 <- t_test(meanx=33.1, varx=18.49, nx=20, mu=35, alternative='less')
test3
plot(test3, shade.col='firebrick1', col='firebrick1')

Example 10.6 (Walpole, 2012): test on two samples with unknown but equal variances

An experiment was performed to compare the abrasive wear of two different laminated materials. Twelve pieces of material 1 were tested by exposing each piece to a machine measuring wear. Ten pieces of material 2 were similarly tested. In each case, the depth of wear was observed. The samples of material 1 gave an average (coded) wear of 85 units with a sample standard deviation of 4, while the samples of material 2 gave an average of 81 with a sample standard deviation of 5. Can we conclude at the 0.05 level of significance that the abrasive wear of material 1 exceeds that of material 2 by more than 2 units? Assume the populations to be approximately normal with equal variances.

In this example we have: $n_1=12$, $\bar{x}_1=85$, $s_1^2=16$, $n_2=10$, $\bar{x}_2=81$, $s_2^2=25$. The objective is to test the next hypothesis:

$H_0: \mu_1-\mu_2 = 2$ vs $H_a: \mu_1-\mu_2 > 2$

$\mu_1=$ True average abrasive wear of material 1 (units) $\mu_2=$ True average abrasive wear of material 2 (units)

The code to perform the test with $\alpha=0.05$ significance is:

test4 <- t_test(meanx=85, varx=16, nx=12,
                meany=81, vary=25, ny=10,
                alternative='greater', var.equal=TRUE, mu=2)
test4
plot(test4, shade.col='firebrick1', col='firebrick1')

2.2.2) t-test using `t.test` function

The t.test function can be used to solve problems in which we have the raw data as a vector. In the next examples we show how to use this function.

Example 8.9 (Devore, 2016)

Glycerol is a major by-product of ethanol fermentation in wine production and contributes to the sweetness, body, and fullness of wines. The article "A Rapid and Simple Method for Simultaneous Determination of Glycerol, Fructose, and Glucose in Wine" (American J. of Enology and Viticulture, 2007: 279-283) includes the following observations on glycerol concentration (mg/mL) for samples of standard-quality (uncertified) white wines:

2.67, 4.62, 4.14, 3.81, 3.83

Does the sample data suggest that true average concentration is something other than the desired value? (Assume that the population distribution of glycerol concentration is normal)

In this example we have: $n=5$. The objective is to test the next hypothesis:

$H_0: \mu = 4$ vs $H_a: \mu \neq 4$

$\mu=$ True average glycerol concentration (mg/mL)

The code to perform the test with $\alpha=0.05$ significance is:

x <- c(2.67, 4.62, 4.14, 3.81, 3.83)
test5 <- t.test(x=x, sigma2=var(x), mu=4, alternative='two.sided')
test5
plot(test5, shade.col='firebrick1', col='firebrick1')

2.3) $\chi^2$-test for population variance $\sigma^2$

The objective in this type of problems is to test $H_0: \sigma^2 = \sigma^2_0$ against:

$H_a: \sigma^2 > \sigma_0^2$,
$H_a: \sigma^2 < \sigma_0^2$,
$H_a: \sigma^2 \neq \sigma_0^2$,

using the information of a random sample $X_1, X_2, \ldots, X_n$ from a population with normal distribution.

2.3.1) $\chi^2$-test using `var_test` function

The var_test function can be used to solve textbook's problems in which the information is summarized with values. In next examples we show the utility of this function.

Example 7.7.1 (Wayne & Chad, 2013)

The purpose of a study by Wilkins et al. (A-28) was to measure the effectiveness of recombinant human growth hormone (rhGH) on children with total body surface area burns > 40 percent. In this study, 16 subjects received daily injections at home of rhGH. At baseline, the researchers wanted to know the current levels of insulin-like growth factor (IGF-I) prior to administration of rhGH. The sample variance of IGF-I levels (in ng/ml) was 670.81. We wish to know if we may conclude from these data that the population variance is not 600.

In this example we have: $n=16$, $s^2=670.81$. The objective is to test the next hypothesis:

$H_0: \sigma^2 = 600$ vs $H_a: \sigma^2 \neq 600$

$\sigma^2=$ True population variance of IGF-I levels

The code to perform the test with $\alpha=0.05$ significance is:

test6 <- var_test(varx=670.81, nx=16, null.value=600, alternative='two.sided')
test6
plot(test6, shade.col='firebrick1', col='firebrick1')

2.3.2) $\chi^2$-test using `var.test` function

The var.test function can be used to solve problems in which we have the raw data as a vector. In the next examples we show how to use this function.

Exercise 10.67 (Walpole, 2012)

The content of containers of a particular lubricant is known to be normally distributed with a variance of 0.03 liter. Test the hypothesis that $\sigma^2 = 0.03$ against the alternative that $\sigma^2 \neq 0.03$ for the random sample of 10 containers:

10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, 9.8

In this example we have: $n=10$. The objective is to test the next hypothesis:

$H_0: \sigma^2 = 0.03$ vs $H_a: \sigma^2 \neq 0.03$

$\sigma^2=$ True population variance of container content of a particular lubricant

The code to perform the test with $\alpha=0.05$ significance is:

x <- c(10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, 9.8)
test7 <- var.test(x=x, null.value=0.03, alternative='two.sided')
test7
plot(test7, shade.col='firebrick1', col='firebrick1')

2.4) F-test for equality of variances

The objective in this type of problems is to test $H_0: \sigma_1^2 = \sigma_2^2$ against:

$H_a: \sigma_x^2 > \sigma_y^2$,
$H_a: \sigma_x^2 < \sigma_y^2$,
$H_a: \sigma_x^2 \neq \sigma_y^2$,

using the information of two random samples $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_n$, both from populations with normal distributions and with unknown variances $\sigma_x^2$,$\sigma_y^2$.

2.4.1) F-test using `var_test` function

The var_test function can be used for students or instructors to solve textbook's problems in which the information is summarized with values. In the next examples we show how to use this function and it's utility.

Example 9.14 (Devore, 2016)

On the basis of data reported in the article "Serum Ferritin in an Elderly Population" (J. of Gerontology, 1979: 521-524), the authors concluded that the ferritin distribution in the elderly had a smaller variance than in the younger adults. (Serum ferritin is used in diagnosing iron deficiency.) For a sample of 28 elderly men, the sample standard deviation of serum ferritin (mg/L) was 52.6; for 26 young men, the sample standard deviation was 84.2. Does this data support the conclusion as applied to men?

In this example we have: $n_x=28$, $s_x^2=2766.76$, $n_y=26$, $s_y^2=7089.64$. The objective is to test the next hypothesis:

$H_0: \sigma_x^2 = \sigma_y^2$ vs $H_a: \sigma_x^2 < \sigma_y^2$

$\sigma_x^2=$ True population variance of serum ferritin in elderly men $\sigma_y^2=$ True population variance of serum ferritin in young men

The code to perform the test with $\alpha=0.01$ significance is:

test8 <- var_test(varx=2766.76, nx=28, vary=7089.64, ny=26, 
                  null.value = 1, alternative='less',conf.level=0.99)
test8
plot(test8, shade.col='firebrick1', col='firebrick1')

2.4.2) F-test using `var.test` function

The var.test function can be used to solve problems in which we have the raw data as a vector. In the next examples we show how to use this function.

Exercise 10.77 (Walpole, 2012)

An experiment was conducted to compare the alcohol content of soy sauce on two different production lines. Production was monitored eight times a day.The data are shown here:

Production line 1: 0.48, 0.39, 0.42, 0.52, 0.40, 0.48, 0.52, 0.52

Production line 2: 0.38, 0.37, 0.39, 0.41, 0.38, 0.39, 0.40, 0.39

Assume both populations are normal. It is suspected that production line 1 is not producing as consistently as production line 2 in terms of alcohol content. Test the hypothesis that $\sigma_1^2 = \sigma_2^2$ against the alternative that $\sigma_1^2 \neq \sigma_2^2$.

In this example we have: $n_1=8$, $n_2=8$. The objective is to test the next hypothesis:

$H_0: \sigma_1^2 = \sigma_2^2$ vs $H_a: \sigma_1^2 \neq \sigma_2^2$

$\sigma_1^2=$ True population variance of alcohol content in Production line 1 $\sigma_2^2=$ True population variance of alcohol content in Production line 2

The code to perform the test with $\alpha=0.05$ significance is:

line1 <- c(0.48, 0.39, 0.42, 0.52, 0.40, 0.48, 0.52, 0.52)
line2 <- c(0.38, 0.37, 0.39, 0.41, 0.38, 0.39, 0.40, 0.39)
test9 <- var.test(x=line1, y=line2, null.value = 1, alternative='two.sided')
test9
plot(test9, shade.col='firebrick1', col='firebrick1')

3) References

Devore, J. L. (2016), Probability and Statistics for engineering and the sciences, Cengage.
Walpole, R. E., Myers, R. H., Myers, S. L. & Ye, K. (2012), Probability & Statistics for Engineers & Scientists, Prentice Hall.
Wayne, D. & Chad, C. (2013), BIOSTATISTICS: A Foundation for Analysis in the Health Sciences, Wiley.

fhernanb/stests documentation built on July 4, 2025, 1:15 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

fhernanb/stests
Package with useful statistical tests

In fhernanb/stests: Package with useful statistical tests

1) Introduction

2) `stests` package

2.1) z-test for population mean $\mu$

2.1.1) z-test using `z_test` function

Example 10.3 (Walpole, 2012)

2.1.2) z-test using `z.test` function

Example 8.8 (Devore, 2016)

2.2) t-test for population mean $\mu$ and for the difference in means $\mu_1-\mu_2$

2.2.1) t-test using `t_test` function

Excercise 10.29 (Walpole, 2012): test on a single sample

Example 10.6 (Walpole, 2012): test on two samples with unknown but equal variances

2.2.2) t-test using `t.test` function

Example 8.9 (Devore, 2016)

2.3) $\chi^2$-test for population variance $\sigma^2$

2.3.1) $\chi^2$-test using `var_test` function

Example 7.7.1 (Wayne & Chad, 2013)

2.3.2) $\chi^2$-test using `var.test` function

Exercise 10.67 (Walpole, 2012)

2.4) F-test for equality of variances

2.4.1) F-test using `var_test` function

Example 9.14 (Devore, 2016)

2.4.2) F-test using `var.test` function

Exercise 10.77 (Walpole, 2012)

3) References

R Package Documentation

Browse R Packages

We want your feedback!

fhernanb/stests Package with useful statistical tests

In fhernanb/stests: Package with useful statistical tests

1) Introduction

2) stests package

2.1) z-test for population mean $\mu$

2.1.1) z-test using z_test function

Example 10.3 (Walpole, 2012)

2.1.2) z-test using z.test function

Example 8.8 (Devore, 2016)

2.2) t-test for population mean $\mu$ and for the difference in means $\mu_1-\mu_2$

2.2.1) t-test using t_test function

Excercise 10.29 (Walpole, 2012): test on a single sample

Example 10.6 (Walpole, 2012): test on two samples with unknown but equal variances

2.2.2) t-test using t.test function

Example 8.9 (Devore, 2016)

2.3) $\chi^2$-test for population variance $\sigma^2$

2.3.1) $\chi^2$-test using var_test function

Example 7.7.1 (Wayne & Chad, 2013)

2.3.2) $\chi^2$-test using var.test function

Exercise 10.67 (Walpole, 2012)

2.4) F-test for equality of variances

2.4.1) F-test using var_test function

Example 9.14 (Devore, 2016)

2.4.2) F-test using var.test function

Exercise 10.77 (Walpole, 2012)

3) References

R Package Documentation

Browse R Packages

We want your feedback!

fhernanb/stests
Package with useful statistical tests

2) `stests` package

2.1.1) z-test using `z_test` function

2.1.2) z-test using `z.test` function

2.2.1) t-test using `t_test` function

2.2.2) t-test using `t.test` function

2.3.1) $\chi^2$-test using `var_test` function

2.3.2) $\chi^2$-test using `var.test` function

2.4.1) F-test using `var_test` function

2.4.2) F-test using `var.test` function