Description Usage Arguments Details Value Note Author(s) References Examples

Given an assignment variable, variables on which to compare the groups assigned to treatment and control conditions, and, optionally, a clustering variable and/or one or more stratifying factors, calculates univariate and multivariate measures for comparing the two groups and for testing the proposition that assignment was random, or effectively random, within levels of a stratifying factor.

```
balanceTest(fmla, data, strata = NULL,
            report = c("std.diffs", "z.scores", "adj.means", "adj.mean.diffs",
                       "chisquare.test", "p.values", "all")[1:2],
            element.weights, stratum.weights = harmonic, subset,
            include.NA.flags = TRUE, covariate.scaling = NULL,
            post.alignment.transform = NULL, p.adjust.method = "holm")
```

`fmla`
A formula containing an indicator of treatment assignment on the left hand side and covariates at right.

`data`
A data frame in which

`strata`
A list of right-hand-side-only formulas containing the factor(s) identifying the strata, with

`report`
Character vector listing measures to report for each stratification; a subset of

`element.weights`
Per-element weight, or 0 if element does not meet condition specified by subset argument. If there are clusters, the cluster weight is the sum of weights of elements within the cluster. Within each stratum, cluster and element weights will be normalized to sum to 1.

`stratum.weights`
Weights to be applied when aggregating across strata specified by

`subset`
Optional; condition or vector specifying a subset of observations to be given positive element weights.

`include.NA.flags`
Present item missingness comparisons as well as covariates themselves?

`covariate.scaling`
A scale factor to apply to covariates in calculating

`post.alignment.transform`
Optional transformation applied to covariates just after their stratum means are subtracted off.

`p.adjust.method`
Method of p-value adjustment.
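As a rough illustration of what a Holm adjustment does, the sketch below applies base R's `p.adjust` to three made-up p-values. It assumes, without confirmation from this page, that `p.adjust.method` takes the same method names as `stats::p.adjust`; the default `"holm"` is consistent with that. Which set of p-values `balanceTest` adjusts over is not stated here, so the names `unstratified`, `by_pt` and `by_region` are purely hypothetical.

```r
# Hypothetical unadjusted p-values (illustrative numbers, not package output).
p_raw <- c(unstratified = 0.010, by_pt = 0.040, by_region = 0.030)

# Holm's step-down adjustment: sort the p-values, multiply the k-th smallest
# by (m - k + 1), then enforce monotonicity and cap at 1.
p_holm <- p.adjust(p_raw, method = "holm")
print(p_holm)
```

Holm's method controls the family-wise error rate without assuming independence among the tests, which may be why it is the default here.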

The function assembles various univariate descriptive statistics for the groups to be compared: (weighted) means of treatment and control groups; differences of these (adjusted differences); and adjusted differences as multiples of a pooled S.D. of the variable in the treatment and control groups (standard differences). This is done separately for each provided stratifying factor and, by default, for the unstratified comparison, in each case reflecting a standardization appropriate to the designated (post-) stratification of the sample.

In the case without stratification or clustering, the only weighting used to calculate treatment and control group means is that provided by the user as `element.weights`; in the absence of such an argument, these means are unweighted. When there are strata, within-stratum means of treatment or of control observations are calculated using `element.weights`, if provided, and then these are combined across strata according to an ‘effect of treatment on treated’-type weighting scheme. (The function's `stratum.weights` argument figures in the function's inferential calculations but not in these descriptive calculations.) To figure a stratum's effect of treatment on treated weight, the sum of all `element.weights` associated with treatment or control group observations within the stratum is multiplied by the fraction of clusters in that stratum that are associated with the treatment rather than the control condition. (Unless this fraction is 0 or 1, in which case the stratum is downweighted to 0.)
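The descriptive weighting described above can be sketched in base R. This is a toy reconstruction from the prose, not the package's internal code: the data frame `toy` and its columns `stratum`, `z`, `x` and `w` are invented for illustration, there is no clustering (each unit is its own cluster), and every stratum contains both treated and control units.

```r
# Toy data: two strata, unit element weights. All values are made up.
toy <- data.frame(
  stratum = c("a", "a", "a", "a", "b", "b", "b", "b"),
  z       = c(1, 0, 0, 0, 1, 1, 1, 0),   # 1 = treatment, 0 = control
  x       = c(2, 4, 6, 8, 1, 3, 5, 7),   # covariate to be compared
  w       = rep(1, 8)                    # element weights
)

by_stratum <- do.call(rbind, lapply(split(toy, toy$stratum), function(d) {
  frac_treated <- mean(d$z == 1)         # fraction of clusters that are treated
  # 'Effect of treatment on treated' weight: total element weight in the
  # stratum times the treated fraction; 0 if the stratum is all one condition.
  ett_weight <- if (frac_treated %in% c(0, 1)) 0 else sum(d$w) * frac_treated
  data.frame(
    ett_weight   = ett_weight,
    mean_treated = weighted.mean(d$x[d$z == 1], d$w[d$z == 1]),
    mean_control = weighted.mean(d$x[d$z == 0], d$w[d$z == 0])
  )
}))

# Combine the within-stratum means across strata using the ETT-type weights.
adj_mean_treated <- weighted.mean(by_stratum$mean_treated, by_stratum$ett_weight)
adj_mean_control <- weighted.mean(by_stratum$mean_control, by_stratum$ett_weight)
adj_diff <- adj_mean_treated - adj_mean_control
print(by_stratum)
print(adj_diff)
```

In this toy example stratum `b`, with three of four units treated, receives three times the weight of stratum `a`, so the adjusted means lean toward the stratum where treatment is more common.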

The function also calculates univariate and multivariate inferential statistics, targeting the hypothesis that assignment was random within strata. These calculations also pool within-stratum group means, weighted by `element.weights`, across strata, but the default weighting of strata differs from that of the descriptive calculations, and is determined by the `stratum.weights` argument. By default, each stratum is weighted in proportion to the product of the stratum mean of `element.weights` and the harmonic mean $1/[(1/a + 1/b)/2] = 2ab/(a+b)$ of the number of treated units (a) and control units (b) in the stratum; this weighting is optimal under certain modeling assumptions (discussed in Kalton 1968, Hansen and Bowers 2008). The multivariate assessment is based on a Mahalanobis-type distance that combines each of the univariate mean differences while accounting for correlations among them. It is similar to Hotelling's T-squared statistic, except that it is standardized using a permutation covariance. See Hansen and Bowers (2008).
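The default inferential weighting can be worked through by hand. The sketch below follows the formula in the text for three hypothetical strata; the counts and the names `n_treated`, `n_control` and `mean_elem_wt` are invented, and this is not the package's own `harmonic` function.

```r
# Hypothetical treated and control counts in three strata.
n_treated    <- c(a = 1, b = 3, c = 5)
n_control    <- c(a = 3, b = 1, c = 5)
mean_elem_wt <- c(a = 1, b = 1, c = 1)   # stratum means of the element weights

# Harmonic mean of the two group sizes: 1 / [(1/a + 1/b) / 2] = 2ab / (a + b).
harmonic_mean <- 2 * n_treated * n_control / (n_treated + n_control)

# Stratum weight is proportional to the product of the two quantities.
stratum_wt <- mean_elem_wt * harmonic_mean
stratum_wt_normalized <- stratum_wt / sum(stratum_wt)
print(stratum_wt_normalized)
```

Strata `a` and `b` have the same size as each other but unbalanced groups, and so get equal, small weights; the balanced stratum `c` dominates. This contrasts with the descriptive weighting, which depends on the treated fraction rather than on balance.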

In contrast to the earlier function `xBalance` that it is intended to replace, `balanceTest` accepts only binary assignment variables (for now).

`stratum.weights` can be either a function or a numeric vector of weights. If it is a numeric vector, it should be non-negative and have stratum names as its names (i.e., its names should equal the levels of the factor specified by `strata`). If it is a function, it should accept one argument, a data frame containing the variables in `data` and additionally `Tx.grp` and `stratum.code`, and return a vector of non-negative weighting factors with stratum codes as names. (To see the function that's applied by default, do `getFromNamespace("harmonic", "RItools")`.) These weighting factors will be multiplied by the stratum mean of `element.weights` to determine the stratum weights used for inferential calculations.
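A user-supplied weighting function might look like the sketch below, which weights each stratum by its number of treated units. The function name `treated_count_weights` is hypothetical, and the code assumes that `Tx.grp` is coded 0/1 or logical, which this page does not state. It is exercised here on a hand-built data frame rather than through `balanceTest`.

```r
# Hypothetical weighting function: one argument (a data frame holding at least
# Tx.grp and stratum.code), returning non-negative weights named by stratum code.
treated_count_weights <- function(data) {
  treated <- as.logical(data$Tx.grp)             # assumes 0/1 or logical coding
  counts <- tapply(treated, data$stratum.code, sum)
  setNames(as.numeric(counts), names(counts))
}

# Stand-in for the data frame the function would receive.
toy <- data.frame(
  Tx.grp       = c(1, 0, 0, 0, 1, 1, 1, 0),
  stratum.code = factor(c("a", "a", "a", "a", "b", "b", "b", "b"))
)
w <- treated_count_weights(toy)
print(w)

# Untested usage sketch, assuming the interface described above:
# balanceTest(pr ~ date + cap + strata(pt), data = nuclearplants,
#             stratum.weights = treated_count_weights)
```

A function is more convenient than a fixed numeric vector when the same call is reused across data sets or subsets, since the weights are recomputed from whatever strata are present.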

If the stratifying factor has NAs, these cases are dropped. If, on the other hand, NAs are found in a covariate, those observations are dropped for descriptive calculations and "imputed" to the stratum mean of the variable for inferential calculations. When covariate values are dropped due to missingness, proportions of observations not missing on that variable are recorded and returned. The printed output presents non-missing proportions alongside the variables themselves, distinguishing the former by placing them at the bottom of the list and enclosing the variable's name in parentheses. If a variable shares a missingness pattern with another variable, its missingness information may be labeled with the name of the other variable in the output.
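The two treatments of a missing covariate value can be illustrated in base R. This is a sketch of the idea only, with an invented data frame `toy`; it does not reproduce the package's internal handling.

```r
# Toy covariate with one missing value in each of two strata.
toy <- data.frame(
  stratum = c("a", "a", "a", "b", "b", "b"),
  x       = c(1, NA, 3, 10, 20, NA)
)

# Non-missingness flag, whose mean is the proportion not missing on x.
x_not_missing <- !is.na(toy$x)
prop_not_missing <- mean(x_not_missing)

# For the inferential side: replace each NA by the mean of the observed
# values in the same stratum.
stratum_mean <- ave(toy$x, toy$stratum, FUN = function(v) mean(v, na.rm = TRUE))
x_imputed <- ifelse(is.na(toy$x), stratum_mean, toy$x)

# For the descriptive side: simply drop the missing observations.
x_observed <- toy$x[x_not_missing]

print(x_imputed)
print(prop_not_missing)
```

Imputing to the stratum mean leaves an observation's stratum-centered value at zero, so it adds nothing to a within-stratum treatment-control difference on that covariate; the separate non-missingness flag is what carries any imbalance in missingness itself.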

An object of class `c("xbal", "list")`. There are `plot`, `print`, and `xtable` methods for class `"xbal"`; the `print` method is demonstrated in the examples.

Evidence pertaining to the hypothesis that a treatment variable is not associated with differences in covariate values is assessed by comparing the differences of means, without standardization, to their distributions under hypothetical shuffles of the treatment variable, a permutation or randomization distribution. For the unstratified comparison, this reference distribution consists of differences as the treatment assignments of clusters are freely permuted. For stratified comparisons, the reference distribution describes re-randomizations of this type performed separately in each stratum. Significance assessments are based on the large-sample Normal approximation to these reference distributions.
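For a small unstratified, unclustered example, the permutation distribution of a difference of means can be enumerated exactly and set beside its Normal approximation. The sketch below uses made-up data and base R only; it illustrates the principle rather than the package's computation, which relies on the Normal approximation and does not enumerate.

```r
# Made-up covariate and binary assignment for eight units, four treated.
x <- c(2, 4, 6, 8, 1, 3, 5, 7)
z <- c(1, 0, 0, 1, 1, 0, 1, 0)
n  <- length(x)
n1 <- sum(z)
n0 <- n - n1

observed <- mean(x[z == 1]) - mean(x[z == 0])

# Every way of choosing n1 of the n units for treatment (one per column).
treated_sets <- combn(n, n1)
perm_diffs <- apply(treated_sets, 2, function(idx) mean(x[idx]) - mean(x[-idx]))

# Moments of the exact permutation distribution.
perm_mean <- mean(perm_diffs)
perm_var  <- mean((perm_diffs - perm_mean)^2)

# Closed form for the same variance under complete randomization.
closed_form_var <- var(x) * n / (n1 * n0)

# Normal-approximation z-score and two-sided p-value, next to the exact
# permutation p-value (small tolerance guards against floating-point ties).
z_score  <- observed / sqrt(perm_var)
p_normal <- 2 * pnorm(-abs(z_score))
p_exact  <- mean(abs(perm_diffs) >= abs(observed) - 1e-12)
print(c(z_score = z_score, p_normal = p_normal, p_exact = p_exact))
```

Because the permutation variance has a closed form, a z-score can be computed without enumerating or simulating shuffles, which is what makes a Normal approximation practical for samples far too large to enumerate.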

Ben Hansen, Jake Bowers, and Mark Fredrickson

Hansen, B.B. and Bowers, J. (2008), “Covariate
Balance in Simple, Stratified and Clustered Comparative
Studies,” *Statistical Science* **23**.

Kalton, G. (1968), “Standardization: A technique to control for
extraneous variables,” *Applied Statistics* **17**,
118–136.

```
library(RItools)
data(nuclearplants)

## No strata, default output
balanceTest(pr ~ date + t1 + t2 + cap + ne + ct + bw + cum.n,
            data = nuclearplants)

## No strata, all output
balanceTest(pr ~ date + t1 + t2 + cap + ne + ct + bw + cum.n,
            data = nuclearplants,
            report = c("all"))

## Stratified, all output
balanceTest(pr ~ . - cost - pt + strata(pt),
            data = nuclearplants,
            report = c("adj.means", "adj.mean.diffs",
                       "chisquare.test", "std.diffs",
                       "z.scores", "p.values"))

## Comparing unstratified to stratified, just adjusted means and
## omnibus test
balanceTest(pr ~ date + t1 + t2 + cap + ne + ct + bw + cum.n + strata(pt),
            data = nuclearplants,
            report = c("adj.means", "chisquare.test"))

## Missing data handling: set early dates to NA, then rerun the
## comparison on the modified data
testdata <- nuclearplants
testdata$date[testdata$date < 68] <- NA
balanceTest(pr ~ date + t1 + t2 + cap + ne + ct + bw + cum.n + strata(pt),
            data = testdata,
            report = c("adj.means", "chisquare.test"))

## Comparing unstratified to stratified, just one-by-one Wilcoxon
## rank sum tests and omnibus test of multivariate differences on
## rank scale
balanceTest(pr ~ date + t1 + t2 + cap + ne + ct + bw + cum.n + strata(pt),
            data = nuclearplants,
            report = c("adj.means", "chisquare.test"),
            post.alignment.transform = rank)
```
