In a previous lab we introduced the two-group independent $t$-test as a method for comparing the means of two groups. In some settings, it is useful to compare the means across more than two groups. The methodology behind a two-group independent $t$-test can be generalized to a procedure called analysis of variance (ANOVA). Assessing whether the means across several groups are equal by conducting a single hypothesis test, rather than multiple two-sample tests, is important for controlling the overall Type I error rate.
The material in this lab corresponds to Section 7.5 of OpenIntro Statistics.
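The concern about the overall Type I error rate can be made concrete with a short calculation. The following is a rough sketch: it assumes a per-test significance level of 0.05 and treats the pairwise tests as independent, which is a simplification (pairwise comparisons that share a group are not independent). The object names `n.tests` and `fwer` are chosen here for illustration.

```r
# Approximate probability of at least one false rejection when every pair
# of groups is compared with its own two-sample test.
# Assumption: tests are independent, each run at level alpha.
alpha <- 0.05                      # per-test significance level (assumed)
k <- 2:6                           # number of groups
n.tests <- choose(k, 2)            # number of pairwise comparisons, k(k-1)/2
fwer <- 1 - (1 - alpha)^n.tests    # family-wise Type I error rate

data.frame(groups = k, comparisons = n.tests, fwer = round(fwer, 3))
```

Under these assumptions, three groups already require three comparisons, and the chance of at least one false rejection rises from 0.05 to roughly 0.14.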
Is change in non-dominant arm strength after resistance training associated with genotype?
In the Functional polymorphisms Associated with Human Muscle Size and Strength study (FAMuSS), researchers examined the relationship between muscle strength and genotype at a particular location on the ACTN3 gene. The famuss dataset loaded below contains a subset of data from the study.
The percent change in non-dominant arm strength, comparing strength after resistance training to before training, is stored as ndrm.ch. There are three possible genotypes (CC, CT, TT) at the r577x position on the ACTN3 gene; genotype is stored as actn3.r577x.
#load the data
load(url('https://github.com/jbryer/DATA606/blob/master/data/famuss.rda?raw=true'))

#create plot

#check assumptions
a) Let the parameters $\mu_{CC}$, $\mu_{CT}$, and $\mu_{TT}$ represent the population mean change in non-dominant arm strength for individuals of the corresponding genotype. State the null and alternative hypotheses.
b) Use summary(aov()) to compute the $F$-statistic and $p$-value. Interpret the $p$-value.
#use summary(aov())
c) Complete the analysis using pairwise comparisons.
i. What is the appropriate significance level $\alpha^{\star}$ for the individual comparisons, as per the Bonferroni correction?
\[\alpha^{\star} = \alpha/K, \text{ where } K = \frac{k(k-1)}{2} \text{ for $k$ groups}\]
#use R as a calculator (note you need to change eval = TRUE once you assign the variables below)
alpha = 
k = 
K = (k*(k-1))/2
alpha.star = alpha/K
alpha.star
ii. Use pairwise.t.test() to conduct the pairwise two-sample $t$-tests.
#The pairwise.t.test() command uses a comma between the two variables x and y,
# instead of a tilde like the aov() command. pairwise.t.test(y, x, p.adj = "")
# NOTE: change eval = TRUE once you have this working

#use pairwise.t.test() without adjusting p-value
pairwise.t.test( , , p.adj = "none")

#alternatively, use pairwise.t.test() with bonferroni adjustment
pairwise.t.test( , , p.adj = "bonf")
iii. Summarize the results.
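The steps in parts (b) and (c) can be rehearsed on a different dataset before working with famuss. The following is a minimal sketch, not the lab solution: it assumes the built-in PlantGrowth data from the datasets package (plant weight for three treatment groups) as a stand-in, an overall significance level of 0.05, and the object name `fit`, which is chosen here for illustration.

```r
# Illustrative only: PlantGrowth stands in for the lab data
data("PlantGrowth")

# overall F-test of the null hypothesis that all group means are equal
fit <- aov(weight ~ group, data = PlantGrowth)
summary(fit)

# Bonferroni-corrected significance level for the individual comparisons
alpha <- 0.05                        # assumed overall significance level
k <- nlevels(PlantGrowth$group)      # number of groups
K <- (k * (k - 1)) / 2               # number of pairwise comparisons
alpha.star <- alpha / K
alpha.star

# unadjusted p-values: compare each one against alpha.star
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                p.adjust.method = "none")

# Bonferroni-adjusted p-values: compare each one against alpha
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                p.adjust.method = "bonferroni")
```

The two pairwise calls lead to the same decisions. The Bonferroni adjustment multiplies each unadjusted $p$-value by $K$ (capped at 1), so comparing an adjusted $p$-value to $\alpha$ is equivalent to comparing the unadjusted $p$-value to $\alpha^{\star}$.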
Is body mass index (BMI) associated with educational attainment?
This section uses data from the National Health and Nutrition Examination Survey (NHANES), a survey conducted annually by the US Centers for Disease Control (CDC).^[The dataset was first introduced in Chapter 1, Lab 1 (Introduction to Data).] The dataset nhanes.samp.adult.500 contains data for 500 participants ages 21 years or older who were randomly sampled from the complete NHANES dataset of 10,000 observations.
The variable BMI contains BMI information for the study participants. The variable Education records the highest level of education obtained: 8$^{th}$ grade, 9$^{th}$ - 11$^{th}$ grade, high school, some college, or college degree.
#load the data
load(url('https://github.com/jbryer/DATA606/blob/master/data/nhanes_samp_adult_500.rda?raw=true'))

#create a plot

#check assumptions

#conduct hypothesis test
Chicken farming is a multi-billion dollar industry, and any methods that increase the growth rate of young chicks can reduce consumer costs while increasing company profits. An experiment was conducted to measure and compare the effectiveness of various feed supplements on the growth rate of chicks. Newly hatched chicks were randomly allocated into groups, and each group was given a different feed supplement.
The chickwts dataset available in the datasets package contains the weight in grams of chicks at six weeks of age. For simplicity, this analysis will be limited to four types of feed supplements: linseed, meatmeal, soybean, and sunflower.
Load the chickwts dataset and subset the data for the four feed supplements of interest.
#load the data
library(datasets)
data("chickwts")

#subset the four feed supplements
keep = (chickwts$feed == "linseed" | chickwts$feed == "meatmeal" |
          chickwts$feed == "soybean" | chickwts$feed == "sunflower")
chickwts = chickwts[keep, ]

#eliminate unused levels
chickwts$feed <- droplevels(chickwts$feed)
#check assumptions

#conduct analysis
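One possible way to approach the assumption check for the four-supplement subset is sketched below. It is a starting point under stated assumptions rather than a complete answer: the object names `feeds`, `chicks.sub`, and `group.sd` are hypothetical, and the subset is rebuilt from datasets::chickwts so that the sketch runs on its own.

```r
# Sketch: summaries and plots for checking ANOVA assumptions
feeds <- c("linseed", "meatmeal", "soybean", "sunflower")
chicks.sub <- droplevels(datasets::chickwts[datasets::chickwts$feed %in% feeds, ])

# group sizes, means, and standard deviations
group.n <- tapply(chicks.sub$weight, chicks.sub$feed, length)
group.mean <- tapply(chicks.sub$weight, chicks.sub$feed, mean)
group.sd <- tapply(chicks.sub$weight, chicks.sub$feed, sd)
cbind(n = group.n, mean = group.mean, sd = group.sd)

# ratio of largest to smallest standard deviation;
# a value near 1 is consistent with roughly equal variability across groups
max(group.sd) / min(group.sd)

# side-by-side boxplots to compare spread and look for skew or outliers
boxplot(weight ~ feed, data = chicks.sub, ylab = "Weight (g)")

# normal Q-Q plot of the residuals from the one-way model
res <- residuals(aov(weight ~ feed, data = chicks.sub))
qqnorm(res)
qqline(res)
```

The independence assumption cannot be checked from these summaries; it rests on the random allocation of chicks to feed groups described above.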