PStrataInfo: Create an object that defines the principal strata

PStrataInfo    R Documentation

Create an object that defines the principal strata

Description

PStrataInfo is a class of object that defines all principal strata to be considered, by specifying the potential value of each post-randomization confounding variable under each treatment arm.

Usage

PStrataInfo(strata, ER = NULL)

Arguments

strata

a list or a vector defining all principal strata. Details of the syntax are given in 'Details' below.

ER

a vector indicating on which strata exclusion restriction is assumed. Details are given in 'Details' below.

Details

Because the definition of the principal strata is fundamental to any principal stratification analysis, a PStrataInfo object can be created in several ways, some more convenient than others depending on the setting.

There are two main ways to create a PStrataInfo object.

By string

To define the principal strata by strings, the strata argument should receive a named vector, each component being the description of one stratum, named after that stratum. The naming does not affect the actual inference, but informative names help users distinguish among strata.

Each stratum is defined by the potential values of the post-randomization confounding variable D under each treatment arm. By convention, the K treatment arms are numbered from 0 to K-1, so each stratum is defined by the tuple (D(0), ..., D(K-1)), which can be written compactly as a string. For example, under binary treatment, the never-takers (i.e. D(0) = D(1) = 0) are represented by the string "00" and the compliers (i.e. D(0) = 0, D(1) = 1) by the string "01". Note that the post-randomization confounding variable can only take values between 0 and 9 for the string to be parsed correctly. This should be sufficient for most applications; when a value of 10 or above is needed, create the PStrataInfo object by matrix (see below).

When multiple post-randomization confounding variables exist, the strings for the individual confounding variables are concatenated with the symbol "|". For example, if D_0 and D_1 are both binary post-randomization confounding variables, the stratum defined by D_0(0) = D_0(1) = 0, D_1(0) = 0, D_1(1) = 1 is represented by the string "00|01". The order of these confounding variables should match the order in which they appear in the S.formula parameter of PSObject.
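The string notation can be decoded by hand to check that a stratum says what is intended. The following is a minimal base-R sketch; decode_stratum is a hypothetical helper written for illustration, not a PStrata function, and it does not handle the ER asterisk.

```r
# Hypothetical helper for illustration; not part of the PStrata package.
decode_stratum <- function(s) {
  # Split on "|" to get one substring per post-randomization variable
  per_var <- strsplit(s, "|", fixed = TRUE)[[1]]
  # Split each substring into single digits: one digit per treatment arm
  rows <- lapply(per_var, function(p) as.integer(strsplit(p, "")[[1]]))
  m <- do.call(rbind, rows)
  # Label rows by variable and columns by arm, both numbered from 0
  dimnames(m) <- list(
    paste0("D_", seq_len(nrow(m)) - 1),
    paste0("arm", seq_len(ncol(m)) - 1)
  )
  m
}

decode_stratum("01")     # compliers: D(0) = 0, D(1) = 1
decode_stratum("01|11")  # two variables: D_0 = (0, 1), D_1 = (1, 1)
```

Each row of the result holds the potential values of one variable, read left to right across treatment arms 0, 1, and so on.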

A common assumption in practice is the exclusion restriction (ER) assumption, which states that the causal effect of the treatment on the outcome is realized entirely through the post-randomization confounding variables. For example, the ER assumption on the stratum of never-takers means that the outcome is identically distributed across the treated and control groups, because all causal effect of the treatment is realized through the post-randomization variable, which takes the same value (0) under both treatment arms. To assume ER for a stratum, put an asterisk "*" at the end of the string, such as "00*" for the never-taker stratum. Note that in the context of multiple post-randomization variables, the package treats all such variables as a single unit: the outcome is assumed to be identically distributed under different treatment arms only when all post-randomization variables take the same values under these treatment arms.

Another way to specify the strata where ER is assumed is to use the ER argument. It takes either a logical vector of the same length as strata, with TRUE indicating that ER is assumed and FALSE otherwise, or a character vector with the names of all strata on which ER is to be assumed. When names are not provided in strata, the strata can be referred to by their canonical names, which are the strings used to define them with asterisks removed. For example, the stratum "00|11*" can be referred to by the name "00|11".
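The three ways of marking ER (trailing asterisk, logical vector, character vector of names) all reduce to one logical flag per stratum. The base-R sketch below illustrates that mapping; resolve_er is a hypothetical helper, not PStrata's actual implementation, and it assumes that an explicit ER argument takes precedence over asterisks without reproducing the package's warning.

```r
# Hypothetical helper for illustration; not PStrata's actual implementation.
resolve_er <- function(strata, ER = NULL) {
  starred   <- grepl("\\*$", strata)    # shorthand: trailing asterisk
  canonical <- sub("\\*$", "", strata)  # definition with the asterisk removed
  # Unnamed strata are referred to by their canonical names
  nm <- if (is.null(names(strata))) canonical else names(strata)
  er <- if (is.null(ER)) {
    starred      # no ER argument: use the asterisks
  } else if (is.logical(ER)) {
    ER           # logical vector: one flag per stratum
  } else {
    nm %in% ER   # character vector: match by stratum name
  }
  setNames(er, nm)
}

# All three specifications flag only the never-takers
resolve_er(c(n = "00*", c = "01", a = "11"))
resolve_er(c(n = "00", c = "01", a = "11"), ER = c(TRUE, FALSE, FALSE))
resolve_er(c(n = "00", c = "01", a = "11"), ER = "n")

# Unnamed strata are matched by canonical name
resolve_er(c("00|11", "01|11"), ER = "00|11")
```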

By matrix

To define the principal strata by matrices, the strata argument should receive a named list, each component being a matrix. The number of rows matches the number of post-randomization variables, and the number of columns matches that of possible treatment arms. For any fixed row i, column j stores the potential value of the i-th post-randomization variable under treatment arm j.
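As an illustration of this layout, the matrices can be built with base R as follows. This is a sketch only: the object names strata and two_var are chosen for the example, and the final call to PStrataInfo is left as a comment because it requires the PStrata package.

```r
# One matrix per stratum: rows are post-randomization variables,
# columns are treatment arms (here arms 0 and 1).
strata <- list(
  n = matrix(c(0, 0), nrow = 1),  # never-takers: D(0) = 0, D(1) = 0
  c = matrix(c(0, 1), nrow = 1),  # compliers:    D(0) = 0, D(1) = 1
  a = matrix(c(1, 1), nrow = 1)   # always-takers: D(0) = 1, D(1) = 1
)

# With two post-randomization variables, each stratum has two rows.
two_var <- matrix(c(0, 0,   # first variable:  0 under both arms
                    0, 1),  # second variable: 0 under arm 0, 1 under arm 1
                  nrow = 2, byrow = TRUE)

two_var[2, 2]  # second variable under the second arm (arm 1)

# Assumed usage, not run here because it needs the PStrata package:
# PStrataInfo(strata = strata, ER = "n")
```

Values of 10 or above, which the string notation cannot express, fit in such matrices without any change to the layout.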

When this approach is used, there is no shorthand for the ER assumption; it must be specified through the ER argument.

Warning: When the ER assumption is specified both in strata and in the ER argument, the shorthand notation for ER in strata is ignored, and a warning is issued regardless of whether the two specifications actually agree.

Value

An object of class PStrataInfo, which is a list with the following components.

num_strata

number of principal strata defined

num_treatment

number of treatment arms

num_postrand_var

number of post-randomization variables

max_postrand_level

integer vector, the largest value taken by each post-randomization variable

strata_matrix

integer matrix, each row corresponding to one stratum and each column corresponding to one treatment arm. The matrix is designed only for internal use.

ER_list

logical vector, each component corresponding to one stratum, indicating whether ER is assumed for the specific stratum

strata_names

character vector, the names of all strata

Examples

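# By string: the trailing asterisk assumes ER for the never-takers
PStrataInfo(strata = c(n = "00*", c = "01", a = "11"))
# By list: ER given as a logical vector, one entry per stratum
PStrataInfo(
  strata = list(n = c(0, 0), c = c(0, 1), a = c(1, 1)),
  ER = c(TRUE, FALSE, FALSE)
)
# By list: ER given as the names of the strata where it is assumed
PStrataInfo(
  strata = list(n = c(0, 0), c = c(0, 1), a = c(1, 1)),
  ER = c("n")
)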


PStrata documentation built on May 29, 2024, 8:17 a.m.