adjusted.predictors: Calculating the Adjusted Group Means of Individual-Level...
In MicroMacroMultilevel: Micro-Macro Multilevel Modeling

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/adjustedpredictors2.r

As the prerequisite step of fitting a micro-macro multilevel model, this function calculates the adjusted group means of individual-level predictors in an unbiased way.

1	adjusted.predictors(x.data, z.data, x.gid, z.gid)

`x.data`	an N-by-p data frame of individual-level predictors, where N denotes the total number of individuals and p denotes the number of individual-level predictors. Must contain no NAs.
`z.data`	a G-by-q data frame of group-level predictors, where G denotes the total number of groups and q denotes the number of group-level predictors. Must contain no NAs.
`x.gid`	an array or an N-by-1 numeric matrix of each individual's group ID. The order corresponds to the individuals in x.data. Duplicates expected.
`z.gid`	an array or a G-by-1 numeric matrix of Group ID. The order corresponds to the groups in z.data. All group IDs should be unique (i.e., no duplicates allowed).

To date, most multilevel methodologies can only unbiasedly model macro-micro multilevel situations, wherein group-level predictors (e.g., city temperature) are used to predict an individual-level outcome variable (e.g., citizen personality). In contrast, this R package enables researchers to model micro-macro situations, wherein individual-level (micro) predictors (and other group-level predictors) are used to predict a group-level (macro) outcome variable in an unbiased way.

To conduct micro-macro multilevel modeling with the current package, one must first compute the adjusted group means with the function adjusted.predictors. This is because in micro-macro multilevel modeling, it is statistically biased to directly regress the group-level outcome variable on the unadjusted group means of individual-level predictors (Croon & van Veldhoven, 2007). Instead, one should use the best linear unbiased predictors (BLUP) of the group means (i.e., the adjusted group means), which is conveniently computed by adjusted.predictors.

Once produced by adjusted.predictors, the adjusted group means can be used as one of the inputs of the micromacro.lm function, which reports estimation results and inferential statistics of the micro-macro multilevel model of interest. Importantly, adjusted.predictors also reports whether group size is the same across all groups, which is a critical dummy input of the micromacro.lm function.

adjusted.group.means a G-by-(p+q+1) numeric matrix that contains p adjusted group means of the individual-level variables from x.data, q group-level predictors from z.data, and unique group IDs.

unequal.groups a boolean variable. TRUE = group size is different across groups; FALSE = group size is the same across groups.

group.size a G-by-2 data frame that displays unique group IDs and the corresponding group sizes.

Jackson G. Lu, Elizabeth Page-Gould, Nancy R. Xu (maintainer, nancyranxu@gmail.com).

Akinola, M., Page-Gould, E., Mehta, P. H., & Lu, J. G. (2016). Collective hormonal profiles predict group performance. Proceedings of the National Academy of Sciences, 113 (35), 9774-9779.

Croon, M. A., & van Veldhoven, M. J. (2007). Predicting group-level outcome variables from variables measured at the individual level: A latent variable multilevel model. Psychological Methods, 12(1), 45-57.

micromacro.lm for fitting the micro-macro multilevel linear model of interest.

######## SETUP: DATA GENERATING PROCESSES ########
set.seed(123)
# Step 1. Generate a G-by-q data frame of group-level predictors (e.g., control variables), z.data
# In this example, G = 40, q = 2
group.id = seq(1, 40)
z.var1 = rnorm(40, mean=0, sd=1)
z.var2 = rnorm(40, mean=100, sd=2)
z.data = data.frame(group.id, z.var1, z.var2)
# Step 2. Generate a G-by-p data frame of group-level means for the predictors that will be used to
# generate x.data
# In this example, there are 3 individual-level predictors, thus p = 3
x.var1.means = rnorm(40, mean=50, sd = .05)
x.var2.means = rnorm(40, mean=20, sd = .05)
x.var3.means = rnorm(40, mean=-10, sd = .05)
x.data.means = data.frame(group.id, x.var1.means, x.var2.means, x.var3.means)
# Step 3. Generate two N-by-p data frames of individual-level predictors, x.data
# One of these two data frames assumes unequal-sized groups (Step 3a),
# whereas the other assumes equal-sized groups (Step 3b):
# Step 3a. Generate the individual-level predictors
# In this example, N = 200 and group size is unequal
x.data.unequal = data.frame( group.id=rep(1:40, times=sample( c(4,5,6), 40, replace=TRUE) )[1:200] )
x.data.unequal = merge( x.data.unequal,
              data.frame( group.id, x.var1.means, x.var2.means, x.var3.means ), by="group.id" )
x.data.unequal = within( x.data.unequal, {
  x.var1 = x.var1.means + rnorm(200, mean=0, sd = 2)
  x.var2 = x.var2.means + rnorm(200, mean=0, sd = 6)
  x.var3 = x.var3.means + rnorm(200, mean=0, sd = 1.5)
})
# Step 3b. Generate the individual-level predictors
# In this example, N = 200 and group size is equal
x.data.equal = data.frame( group.id=rep(1:40, each=5) )
x.data.equal = merge( x.data.equal, x.data.means, by="group.id" )
x.data.equal = within( x.data.equal, {
  x.var1 = x.var1.means + rnorm(200, mean=0, sd = 2)
  x.var2 = x.var2.means + rnorm(200, mean=0, sd = 6)
  x.var3 = x.var3.means + rnorm(200, mean=0, sd = 1.5)
})
# Step 3. Generate a G-by-1 data frame of group-level outcome variable, y
# In this example, G = 40
y = rnorm(40, mean=6, sd=5)

apply(x.data.equal,2,mean)
#    group.id x.var1.means x.var2.means x.var3.means       x.var3       x.var2       x.var1
# 20.500000    50.000393    19.994708    -9.999167   -10.031995    20.185361    50.084635
apply(x.data.unequal,2,mean)
#    group.id x.var1.means x.var2.means x.var3.means       x.var3       x.var2       x.var1
# 20.460000    50.002286    19.994605    -9.997034    -9.983146    19.986111    50.123591
apply(z.data,2,mean)
# z.var1      z.var2
# 0.04518332 99.98656817
mean(y)
# 6.457797

######## EXAMPLE 1. GROUP SIZE IS DIFFERENT ACROSS GROUPS ########
######## Need to use adjusted.predictors() in the same package ###

# Step 4a. Generate a G-by-1 matrix of group ID, z.gid. Then generate an N-by-1 matrix of
# each individual's group ID, x.gid, where the group sizes are different
z.gid = seq(1:40)
x.gid = x.data.unequal$group.id
# Step 5a. Generate the best linear unbiased predictors that are calcualted from
# individual-level data
x.data = x.data.unequal[,c("x.var1","x.var2","x.var3")]
results = adjusted.predictors(x.data, z.data, x.gid, z.gid)
# Note: Given the fixed random seed, the output should be as below
results$unequal.groups
# TRUE
names(results$adjusted.group.means)
# "BLUP.x.var1" "BLUP.x.var2" "BLUP.x.var3" "z.var1"      "z.var2"      "gid"
head(results$adjusted.group.means)
#   BLUP.x.var1 BLUP.x.var2 BLUP.x.var3 group.id      z.var1    z.var2 gid
# 1    50.05308    20.83911  -10.700361        1 -0.56047565  98.61059   1
# 2    48.85559    22.97411   -9.957270        2 -0.23017749  99.58417   2
# 3    50.16357    19.50001   -9.645735        3  1.55870831  97.46921   3
# 4    49.61853    21.25962  -10.459398        4  0.07050839 104.33791   4
# 5    50.49673    21.38353   -9.789924        5  0.12928774 102.41592   5
# 6    50.86154    19.15901   -9.245675        6  1.71506499  97.75378   6

######## EXAMPLE 2. GROUP SIZE IS THE SAME ACROSS ALL GROUPS ########
######## Need to use adjusted.predictors() in the same package ###

# Step 4b. Generate a G-by-1 matrix of group ID, z.gid. Then generate an N-by-1 matrix of
# each individual's group ID, x.gid, where group size is the same across all groups
z.gid = seq(1:40)
x.gid = x.data.equal$group.id
# Step 5b. Generate the best linear unbiased predictors that are calcualted from
# individual-level data
x.data = x.data.equal[,c("x.var1","x.var2","x.var3")]
results = adjusted.predictors(x.data, z.data, x.gid, z.gid)
results$unequal.groups
# FALSE
names(results$adjusted.group.means)
# "BLUP.x.var1" "BLUP.x.var2" "BLUP.x.var3" "z.var1"      "z.var2"      "gid"
results$adjusted.group.means[1:5, ]
#   BLUP.x.var1 BLUP.x.var2 BLUP.x.var3 group.id      z.var1    z.var2 gid
# 1    50.91373    19.12994  -10.051647        1 -0.56047565  98.61059   1
# 2    50.19068    19.17978  -10.814382        2 -0.23017749  99.58417   2
# 3    50.13390    20.98893   -9.952348        3  1.55870831  97.46921   3
# 4    49.68169    19.60632  -10.612717        4  0.07050839 104.33791   4
# 5    50.28579    22.07469  -10.245505        5  0.12928774 102.41592   5

Warning messages:
1: In x.var1.means + rnorm(200, mean = 0, sd = 2) :
  longer object length is not a multiple of shorter object length
2: In x.var2.means + rnorm(200, mean = 0, sd = 6) :
  longer object length is not a multiple of shorter object length
3: In x.var3.means + rnorm(200, mean = 0, sd = 1.5) :
  longer object length is not a multiple of shorter object length
4: In `[<-.data.frame`(`*tmp*`, nl, value = list(x.var3 = c(-12.0702873063753,  :
  replacement element 1 has 200 rows to replace 194 rows
5: In `[<-.data.frame`(`*tmp*`, nl, value = list(x.var3 = c(-12.0702873063753,  :
  replacement element 2 has 200 rows to replace 194 rows
6: In `[<-.data.frame`(`*tmp*`, nl, value = list(x.var3 = c(-12.0702873063753,  :
  replacement element 3 has 200 rows to replace 194 rows
    group.id x.var1.means x.var2.means x.var3.means       x.var3       x.var2 
   20.500000    50.000393    19.994708    -9.999167    -9.985279    19.986214 
      x.var1 
   50.121698 
    group.id x.var1.means x.var2.means x.var3.means       x.var3       x.var2 
   20.118557    50.001697    19.993560    -9.997938    -9.986925    20.225848 
      x.var1 
   50.060817 
   group.id      z.var1      z.var2 
20.50000000  0.04518332 99.98656817 
[1] 6.457797
[1] TRUE
[1] "BLUP.x.var1" "BLUP.x.var2" "BLUP.x.var3" "group.id"    "z.var1"     
[6] "z.var2"      "gid"        
  BLUP.x.var1 BLUP.x.var2 BLUP.x.var3 group.id      z.var1    z.var2 gid
1    50.37630    22.25261  -10.023444        1 -0.56047565  98.61059   1
2    49.88690    22.53537  -10.033116        2 -0.23017749  99.58417   2
3    49.75327    19.53242  -11.197199        3  1.55870831  97.46921   3
4    50.16614    19.55715   -9.663839        4  0.07050839 104.33791   4
5    50.52802    20.74453   -9.646215        5  0.12928774 102.41592   5
6    49.41930    19.90466  -10.481041        6  1.71506499  97.75378   6
[1] FALSE
[1] "BLUP.x.var1" "BLUP.x.var2" "BLUP.x.var3" "group.id"    "z.var1"     
[6] "z.var2"      "gid"        
  BLUP.x.var1 BLUP.x.var2 BLUP.x.var3 group.id      z.var1    z.var2 gid
1    50.05587    20.82469  -10.730297        1 -0.56047565  98.61059   1
2    48.71397    22.85125   -9.915786        2 -0.23017749  99.58417   2
3    50.82770    20.13679  -10.164313        3  1.55870831  97.46921   3
4    49.45720    21.41725   -9.994019        4  0.07050839 104.33791   4
5    50.55675    20.26460   -9.880389        5  0.12928774 102.41592   5