
edlf8360

An R package for EDLF 8360 at UVA's School of Education and Human Development (Fall 2018).

Installation

You know the drill:

devtools::install_github("McCartneyAC/edlf8360")

Data

Over time, the package will grow to include all of the datasets from the two textbooks: Hox, Moerbeek, and van de Schoot (2018) and Rabe-Hesketh and Skrondal (2012). For now, I am getting a consistent error with haven::read_sav, so I will need to use SPSS in the library to convert the files to .csv and then to .rda. This also means that the popularity data pop.dat from Hox et al. (2018) contain no header row, since that is how the data are provided on the authors' GitHub repo.
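In the meantime, a headerless file like pop.dat can be read with base R and converted by hand; a rough sketch (the column names below are placeholders, not taken from the file):

# Read the headerless pop.dat, name the columns, and save as .rda.
# The names here are placeholders; use the variable names from Hox et al.'s
# codebook.
pop <- read.table("pop.dat", header = FALSE)
names(pop) <- c("var1", "var2", "var3")
save(pop, file = "pop.rda")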

Datasets:

Usage:

data("hsb")

data("hours")

Functions

So far, the package includes the following useful functions:

Center

Centers a variable on a given value. It should be usable within a tidy data pipe, e.g.

hsb %>%
  center(ses, mean(ses))


# or 

hsb %>%
  center(mathach, 5)

The first example above performs grand-mean centering; however, in HLM we often need group-mean centering. This is easily accomplished:

hsb %>% 
  group_by(schoolid) %>% 
  center(mathach, mean(mathach)) 
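Under the hood, a pipe-friendly centering helper can be written with tidy evaluation. Here is a minimal sketch (hypothetical; not the package's actual implementation, and the new column name is an assumption):

library(dplyr)

# Both `var` and `center` are evaluated inside the data, so mean(mathach)
# respects any group_by() grouping.
center_sketch <- function(df, var, center) {
  mutate(df, centered = {{ var }} - {{ center }})
}

hsb %>%
  group_by(schoolid) %>%
  center_sketch(mathach, mean(mathach))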

Standardize

Like center, standardize adjusts a variable within a pipe. In this case, it performs a z-transformation on the given variable:

hsb %>% 
  select(ses, mathach) %>% 
  standardize(mathach) %>% 
  head()
# A tibble: 6 x 2
      ses mathach
    <dbl>   <dbl>
1 -1.53    -0.999
2 -0.588    1.01 
3 -0.528    1.11 
4 -0.668   -0.577
5 -0.158    0.749
6  0.0220  -1.19 
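A pipe-friendly z-transformation can be written in the same style; a minimal sketch (hypothetical, not the package source):

library(dplyr)

# Overwrites the chosen variable with its z-scores.
standardize_sketch <- function(df, var) {
  mutate(df, across({{ var }}, ~ as.numeric(scale(.x))))
}

hsb %>%
  select(ses, mathach) %>%
  standardize_sketch(mathach) %>%
  head()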

Regress

Named as an analogue to Stata's regress command, this function does two things: it allows regression to be the final step of a data pipeline, and it specifies a linear model and summarizes it in a single step. It also means that the data = argument of lm() is no longer necessary. It does this by calling summary(lm()) while reordering the arguments of lm() so that data = comes first. For example:

hsb %>% 
   filter(schoolid %in% c(1224, 1288, 1296)) %>% 
   center(ses, mean(ses)) %>% 
   regress(mathach~female + minority)
Call:
lm(formula = ..1, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-13.7125  -5.0128  -0.0476   5.2404  14.8554 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   12.453      1.069  11.653  < 2e-16 ***
female        -1.573      1.230  -1.279 0.203471    
minority      -4.137      1.219  -3.394 0.000941 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.634 on 117 degrees of freedom
Multiple R-squared:  0.1044,    Adjusted R-squared:  0.08905 
F-statistic: 6.817 on 2 and 117 DF,  p-value: 0.001584

NOTE: Because of this structure, the specified model is never saved into the global environment, so, e.g., the plot_margins function can't be used after regress.
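For reference, such a wrapper could look roughly like this (a sketch, not the package source; it assumes the data frame is the first argument and the formula the second):

# The data frame comes first so the function can end a pipe, and the model
# is summarized immediately rather than saved.
regress_sketch <- function(df, formula, ...) {
  summary(lm(formula = formula, data = df, ...))
}

hsb %>% regress_sketch(mathach ~ female + minority)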

regress also supports clustered standard errors (though this requires additional packages to be installed).

regress(hsb, formula = mathach ~ ses, cluster = "schoolid")
R^2= 0.13014 

            Estimate Std. Error  t value      Pr(>|t|)
(Intercept) 12.74740  0.1694473 75.22927  0.000000e+00
ses          3.18387  0.1334850 23.85190 9.677759e-126
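This is not necessarily how regress computes them internally, but the same cluster-robust standard errors can be checked with the sandwich and lmtest packages:

library(sandwich)
library(lmtest)

# Cluster-robust standard errors for comparison.
fit <- lm(mathach ~ ses, data = hsb)
coeftest(fit, vcov = vcovCL(fit, cluster = hsb$schoolid))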

Margins Plots

Takes a pre-specified linear model and draws a margins plot with ggplot2. The plot comes without styling or labels, so these must be added post hoc.

model5 <- lm(mathach ~ minority + female + ses, data = hsb)
plot_margins(model5)
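Assuming plot_margins() returns a ggplot object (implied by the note about adding labels post hoc), styling can be layered on in the usual way:

library(ggplot2)

plot_margins(model5) +
  labs(x = "SES", y = "Predicted Math Achievement") +
  theme_minimal()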

For a better take on this, see the sjPlot package.

Interactions in margins plots

Interaction margins plots like the ones Stata generates can still be produced, but they require additional steps. There is no explicit function for these; they can be built from the fitted values of a model from lme4 or nlme, paired with ggplot2. (In the example below, model4 is assumed to be a previously fitted multilevel model; theme_stata() and scale_color_stata() come from the ggthemes package.)

hsb %>% 
  mutate(pts = fitted(model4, level = 0, asList = FALSE)) %>% 
  # fitted points become the y axis later. level = 0 indicates L1 effects. 
  filter(ses > -2) %>% 
  filter(ses < 2) %>% 
  ggplot(aes(x = ses, y = pts, color = factor(female) )) + 
  stat_summary_bin(fun.data="mean_cl_boot", geom="errorbar", bins = 10) + 
  stat_summary_bin(fun.y = "mean", geom= "point", bins = 10) +
  stat_summary_bin(fun.y = mean, geom="line", bins = 10) + 
  theme_stata() +
  scale_color_stata() + 
  labs(
    title = "Interactions of Gender and SES on Math Achievement",
    color = "Female" ,
    x = "Socio-Economic Status",
    y = "Math Achievement Fitted Points",
    caption = "Model: mathach ~ female + smn_fem + ses + smn_ses + pracad | schoolid"
  )

Calc_ICC

~coming as soon as we cover the topic~

psych provides psych::ICC, which has several different implementations of ICC calculations; however, these are specifically derived for inter-rater reliability estimates, and it's not clear whether they'll be useful for mixed-effects models because:

1. Which specification do we need?
2. The matrix operation means we'll need some procedure for using select, with the variables being different (not the case for inter-rater reliability).
3. It should be as simple as icc(model5) > 0.25 or something.

I'm going to keep thinking about this.

SOLVED.

library(lme4)
model2 <- lmer(lnwg ~ 1 + (1 | id), data = hours)
icc(model2)
[1] 0.8250423
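For a random-intercept model, the ICC is the between-cluster variance over the total variance, tau00 / (tau00 + sigma^2). A sketch of that calculation (not necessarily the package's exact code):

library(lme4)

# ICC from the variance components of a random-intercept model.
icc_sketch <- function(model) {
  vc     <- as.data.frame(VarCorr(model))
  tau00  <- vc$vcov[vc$grp != "Residual"][1]  # between-cluster variance
  sigma2 <- vc$vcov[vc$grp == "Residual"]     # residual variance
  tau00 / (tau00 + sigma2)
}

icc_sketch(model2)  # should agree with icc(model2) above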

Effective Sample Size

ess() will calculate an effective sample size when given a null model, a number of participants, and a number of clusters.

ess(model1, participants = 1000, clusters = 50)
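The usual calculation divides the total N by the design effect, 1 + (m - 1) * ICC, where m is the average cluster size. A sketch under that assumption (the package's ess() may differ in detail):

# Hypothetical sketch: effective sample size via the design effect.
ess_sketch <- function(null_model, participants, clusters) {
  rho <- icc(null_model)              # ICC from the intercept-only model
  m   <- participants / clusters      # average cluster size
  participants / (1 + (m - 1) * rho)  # N divided by the design effect
}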

Construct a 95% prediction interval for the school slopes

slopes_CI <- function(estimate, variance) {
  ub <- estimate + (1.96 * sqrt(variance))
  lb <- estimate - (1.96 * sqrt(variance))
  print(paste("UB is", round(ub, 3), "LB is", round(lb, 3)))
}

slopes_CI(2.19, 0.69)
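(The 1.96 multiplier is the 97.5th percentile of the standard normal, so the interval assumes the slope estimates are approximately normally distributed.)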

