Notice the echo=FALSE, message=FALSE here. If you want to hide your R code you can do that by adding those inside the {}. We are going to load some packages we need for this lab.

library('lehmansociology')
library('ggplot2')
library(scales)
library(grid)
library(lsr)
library(xtable) 
library(dplyr)

We are going to be working with the addhealth data so let's attach it.

attach(addhealth)

We are going to start by working with the variable tvhrs.

Remember that we have looked at tvhrs before. Let's look at the mean and standard deviation and a histogram.

mean(tvhrs)
sd(tvhrs)

Now, we can ask R to show us the mean of tvhrs for different raceeth groups and histograms of tvhrs for each group. Try to notice the differences between the ggplot code here and the ggplot code that you've used before.

tvhrs_raceeth <- addhealth %>% 
  group_by(raceeth) %>% 
  summarize(mean(tvhrs), median(tvhrs), IQR(tvhrs), sd(tvhrs))
tvhrs_raceeth

ggplot(addhealth, aes(tvhrs)) +
  geom_histogram(aes(y=..density..),binwidth=1, fill="black") +
  facet_wrap(~raceeth) +
  scale_y_continuous(labels=percent) +
  xlab("") +
  ylab("") +
  ggtitle("Figure #")

Now let's ask R for some confidence interval around the mean for tvhrs.

ciMean(tvhrs, conf =0.90)
ciMean(tvhrs, conf =0.95)
ciMean(tvhrs, conf=0.99)

Write a sentence(using the results that you get) to interpret one of these three confidence intervals.

What is the difference between these three confidence intervals? What happens when you change the level of confidence from 90% to 99%?


Let's build upon that and get confidence intervals around the mean for each race/ethnic group. Although it is not clearly labeled as such, this provides a 95% confidence interval.

````r

aggregate(tvhrs, by=list(raceeth), ciMean)

```

Compare the confidence intervals for the 6 race/ethnic groups. Write a few sentences that compare/contrast and INTERPRET the most interesting race/ethnic differences in the confidence intervals of the mean of tvhrs.

Which race/ethnic group has the widest confidence interval around the mean? the narrowest?

Looking back at the histogram of tvhrs by race/ethnic groups, which group would you expect to have the widest confidence interval? the narrowest?

Are you answers to the two above questions different? Why or why not?



elinw/lehmansociology documentation built on May 16, 2019, 3 a.m.