View source: R/subgroup_detect.R
subgroup_detect | R Documentation |
Tests for the existence of a subgroup with an enhanced treatment effect. The subgroup of interest is represented by \{X:θ^T X≥ 0\}, where θ is the change-plane parameter. The test returns a p-value for H_0:τ=0, where τ is the treatment effect in this subgroup. If H_0 is rejected, the estimate of θ can be used to obtain the estimated subgroup.
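As a rough illustration of how an estimate of θ defines a subgroup, the sketch below applies the rule θ^T X ≥ 0 to a few covariate rows. The values in `theta_hat` are made up, and the assumption that the first element is an intercept followed by the covariates in formula order is ours, not something stated in this documentation.

```r
# Hypothetical estimate; in practice this would be taken from the `theta`
# element of the list returned by subgroup_detect().
# Assumed ordering: intercept first, then covariates x1 and x2.
theta_hat <- c(intercept = -0.15, x1 = 0.30, x2 = 0.94)

# Illustrative covariate values for four subjects
newdata <- data.frame(x1 = c(0, 1, 0, 1),
                      x2 = c(-0.5, -0.5, 0.5, 0.5))

# Design matrix with a leading column of ones for the intercept
X <- cbind(1, as.matrix(newdata[, c("x1", "x2")]))

# A subject belongs to the estimated subgroup when theta^T X >= 0
in_subgroup <- drop(X %*% theta_hat) >= 0
in_subgroup
```

With these made-up values, only the two subjects with x2 = 0.5 fall in the subgroup.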
subgroup_detect(outcome, propen, data, K = 1000L, M = 1000L, seed = NULL)
outcome |
A formula object. The linear model for the outcome regression. The left-hand-side variable must be the response. R function lm will be used to estimate model parameters. The response must be continuous. |
propen |
A formula object. The model for the propensity score. The left-hand-side variable must be the treatment variable. R function glm will be used with input option family = binomial(link="logit") to estimate model parameters. The treatment must be binary. |
data |
A data.frame object. All covariates, treatment, and response variables. Note that the treatment must be binary and that the response must be continuous. |
K |
An integer object. The number of randomly sampled points on the surface of the unit ball \{θ:||θ||^2=1\}. These points are used to approximate the Gaussian process in the null and local alternative distributions of the test statistic by multivariate normal distributions. It is recommended that K be set to 10^p, where p is the number of parameters in the outcome model. Because K grows exponentially in p, it is also recommended that the number of covariates be less than 10 for this implementation. Default value is 1000. |
M |
An integer object. The number of resamplings of the perturbed test statistic. This sample is used to calculate the critical value of the test. Default and minimum values are 1000. |
seed |
An integer object or NULL. If an integer, the seed for random number generation, set at the onset of the calculation. If NULL, the current seed in the R environment is used. |
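The argument K refers to points sampled on the unit sphere. One standard way to draw such points uniformly is to normalize vectors of independent standard normal draws. The sketch below shows that technique only; `sample_unit_sphere` is a hypothetical helper, and the package may sample differently internally.

```r
# Hypothetical helper (not part of the package): draw K points uniformly
# on the unit sphere in R^d by normalizing standard normal vectors.
sample_unit_sphere <- function(K, d, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  z <- matrix(rnorm(K * d), nrow = K, ncol = d)
  # Divide each row by its Euclidean norm (recycling runs down columns,
  # so row i is divided by the norm of row i)
  z / sqrt(rowSums(z^2))
}

theta <- sample_unit_sphere(K = 1000L, d = 3L, seed = 1234L)
range(rowSums(theta^2))  # squared norms equal 1 up to rounding error
```

Each row of `theta` is one candidate direction θ. With d parameters, the number of points needed to cover the sphere at a fixed resolution grows quickly in d, which is consistent with the recommendation to scale K as 10^p.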
In this function, a linear model with least squares estimate is used for fitting the baseline model μ(X), and a logistic model with maximum likelihood estimate is used for fitting the propensity score model P(a=1|X). These settings cannot be changed by the user.
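For orientation, the two fixed model fits described above correspond to ordinary calls to lm and glm. The sketch below reproduces them on simulated data; the data-generating values and the object names (`dat`, `fit_outcome`, `fit_propen`) are illustrative assumptions, not package internals.

```r
set.seed(42)

# Simulated data: two covariates, a binary treatment, a continuous response
n  <- 200
x1 <- rbinom(n, size = 1, prob = 0.5)
x2 <- runif(n, min = -1, max = 1)
a  <- rbinom(n, size = 1, prob = 0.5)
y  <- 1 + x1 + x2 + 0.5 * a + rnorm(n, mean = 0, sd = 0.5)
dat <- data.frame(x1, x2, a, y)

# Baseline (outcome) model: linear model fitted by least squares
fit_outcome <- lm(y ~ x1 + x2, data = dat)

# Propensity score model: logistic regression fitted by maximum likelihood
fit_propen <- glm(a ~ x1 + x2, data = dat,
                  family = binomial(link = "logit"))

mu_hat <- fitted(fit_outcome)  # estimated baseline mean for each subject
pi_hat <- fitted(fit_propen)   # estimated P(a = 1 | X) for each subject
```

The formulas passed as `outcome` and `propen` play the same role as the two formulas in this sketch.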
A list consisting of
outcome |
An lm object. The object returned by the lm fit of the outcome. |
propen |
A glm object. The object returned by the glm fit of the propensity. |
p_value |
A numeric object. The p-value of the test. |
theta |
A named numeric vector. The change-plane parameter estimates that define the subgroup. |
prop |
A numeric object. The proportion of the K points sampled on the surface of the unit ball for θ that are used to calculate the test statistic. For some values of θ, the subgroup contains no samples or all samples; these points are discarded. |
seed |
If seed was provided as input, the user-specified integer seed. If seed was not provided, this element is not present. |
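The meaning of `prop` can be illustrated with a small simulation of the discard rule described above: a sampled θ is kept only if the subgroup it defines is neither empty nor the whole sample. This is a sketch under our own assumptions (simulated covariates, normalized normal draws for θ) and is not the package's internal code.

```r
set.seed(1)

# Simulated design matrix: intercept, one binary and one continuous covariate
n <- 50
X <- cbind(1, rbinom(n, size = 1, prob = 0.5), runif(n, min = -1, max = 1))

# K candidate directions on the unit sphere (normalized normal draws)
K <- 1000L
z <- matrix(rnorm(K * ncol(X)), nrow = K)
theta <- z / sqrt(rowSums(z^2))

# n x K logical matrix: entry (i, k) is TRUE when subject i lies in the
# subgroup defined by the k-th sampled theta
in_subgroup <- (X %*% t(theta)) >= 0

# Keep only directions whose subgroup is neither empty nor the full sample
size <- colSums(in_subgroup)
keep <- size > 0 & size < n

prop <- mean(keep)  # proportion of sampled points that are retained
prop
```

A value of `prop` well below 1 means that many sampled directions were uninformative for the data at hand.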
Ailin Fan, Rui Song, and Wenbin Lu (2016). Change-plane analysis for subgroup detection and sample size calculation. Journal of the American Statistical Association, in press.
# set parameters
tau <- 0.5
theta_t <- c(-0.15, 0.3, sqrt(1 - (-0.15)^2 - (0.3)^2))
beta <- c(1, 1, 1)
sigma <- 0.5
n <- 50
p <- 2

# generate data
x1 <- rbinom(n, size = 1, prob = 0.5)
x2 <- runif(n, min = -1, max = 1)
X <- cbind(1, x1, x2)
a <- rbinom(n, 1, prob = 0.5)
y <- drop(X %*% beta) + tau * a * (drop(X %*% theta_t) >= 0) + rnorm(n, 0, sigma)
data <- data.frame(X[, 2:3], a, y)

# test for a subgroup with an enhanced treatment effect
subgroup_detect(outcome = y ~ x1 + x2, propen = a ~ x1 + x2, data = data)