rrtest_clust: Residual randomization test under cluster invariances
In RRI: Residual Randomization Inference for Regression Models


This function tests the linear hypothesis specified in model, assuming that the errors satisfy the form of cluster invariance given by type, within the clusters defined by clustering.

rrtest_clust(
  model,
  type,
  clustering = NULL,
  num_R = 999,
  alpha = 0.05,
  val_type = "decision"
)

`model`	Regression model and hypothesis. See example_model for details.
`type`	A `character`, either "perm", "sign" or "double".
`clustering`	A `list` that specifies a clustering of datapoint indexes 1, ..., n. See example_clustering. If NULL, it takes a default value that depends on `type` (see Note).
`num_R`	Number of test statistic values to calculate in the test.
`alpha`	Nominal test level (between 0 and 1).
`val_type`	The type of return value: "decision", "pval" or "full".

For the regression y = X * beta + e, this function tests the following linear null hypothesis:

H0: lam' beta = lam[1] * beta[1] + ... + lam[p] * beta[p] = lam0,

where y, X, lam, lam0 are specified in model. The assumption is that the errors, e, have some form of cluster invariance. Specifically:

If type = "perm" then the errors are assumed exchangeable within the specified clusters:

(e_1, e_2, ..., e_n) ~ cluster_perm(e_1, e_2, ..., e_n),

where ~ denotes equality in distribution, and cluster_perm is any random permutation within the clusters defined by clustering. Internally, the test repeatedly calculates a test statistic by randomly permuting the residuals within clusters.
If type = "sign" then the errors are assumed sign-symmetric within the specified clusters:

(e_1, e_2, ..., e_n) ~ cluster_signs(e_1, e_2, ..., e_n),

where cluster_signs is a random sign flip of the residuals at the cluster level. Internally, the test repeatedly calculates a test statistic by randomly flipping the signs of cluster residuals.
If type = "double" then the errors are assumed both exchangeable and sign symmetric within the specified clusters:

(e_1, e_2, ..., e_n) ~ cluster_signs(cluster_perm(e_1, e_2, ..., e_n)).

Internally, the test repeatedly calculates a test statistic by permuting and randomly flipping the signs of residuals on the cluster level.
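The three invariances can be illustrated in base R. The sketch below is not the package's internal code: cluster_perm and cluster_signs are hypothetical helpers named after the notation above, e stands in for a residual vector, and the sign flip is assumed to apply one random sign to each cluster as a whole.

```r
# Hypothetical helpers illustrating the invariances; not RRI internals.
# `e` is a numeric vector, `clustering` a list of index vectors.

# Permute the entries of e within each cluster.
cluster_perm <- function(e, clustering) {
  for (idx in clustering) {
    # Indexing through sample.int() avoids the sample(x) pitfall
    # when a cluster holds a single index.
    e[idx] <- e[idx[sample.int(length(idx))]]
  }
  e
}

# Flip the sign of a whole cluster at once (assumed: one random sign per cluster).
cluster_signs <- function(e, clustering) {
  for (idx in clustering) {
    e[idx] <- sample(c(-1, 1), 1) * e[idx]
  }
  e
}

set.seed(1)
e <- c(0.5, -1.2, 0.3, 2.0, -0.7, 1.1)  # made-up residuals
cl <- list(1:3, 4:6)                    # two clusters of three datapoints

e_perm   <- cluster_perm(e, cl)                      # as in type = "perm"
e_sign   <- cluster_signs(e, cl)                     # as in type = "sign"
e_double <- cluster_signs(cluster_perm(e, cl), cl)   # as in type = "double"
```

In each case the randomized vector keeps the same values (up to sign) inside every cluster, which is what the corresponding invariance assumption says should leave the error distribution unchanged.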

If val_type = "decision" (default), the function returns the binary test decision (1 = REJECT H0).

If val_type = "pval", it returns the test p-value.

If val_type = "full", it returns the full test output, i.e., a list with elements tobs and tvals, the observed and randomization values of the test statistic, respectively.
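To show how the three return types relate, the sketch below derives a p-value and a decision from a "full"-style list. The values of tobs and tvals are simulated stand-ins, and the two-sided rule is a generic randomization-test convention assumed here for illustration; it is not necessarily the exact rule rrtest_clust applies.

```r
# Stand-in values; a real `out` would come from
# rrtest_clust(model, type, val_type = "full").
set.seed(1)
out <- list(tobs = 2.5, tvals = rnorm(999))

# Assumed two-sided convention: share of randomization values at least as
# extreme as the observed one, counting the observed value itself.
pval <- (1 + sum(abs(out$tvals) >= abs(out$tobs))) / (1 + length(out$tvals))

# Binary decision at nominal level alpha = 0.05 (1 = reject H0).
decision <- as.numeric(pval <= 0.05)
```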

If clustering is NULL then it will be assigned a default value:

list(1:n) if type = "perm", where n is the number of datapoints;
as.list(1:n) if type = "sign" or "double".
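The difference between the two defaults is easy to see in base R. In this small sketch, n = 5 is an arbitrary choice for illustration, and cl_perm and cl_sign are hypothetical variable names.

```r
n <- 5

# Default for type = "perm": a single cluster holding every index,
# so residuals may be permuted across all datapoints.
cl_perm <- list(1:n)

# Default for type = "sign" or "double": one singleton cluster per datapoint,
# so each residual gets its own sign flip.
cl_sign <- as.list(1:n)

length(cl_perm)  # number of clusters under the "perm" default
length(cl_sign)  # number of clusters under the "sign"/"double" default
```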

As with the bootstrap, num_R is usually between 1000 and 5000.

Life after bootstrap: residual randomization inference in regression models (Toulis, 2019)

https://www.ptoulis.com/residual-randomization

# 1. Validity example
set.seed(123)
n = 50
X = cbind(rep(1, n), 1:n/n)
beta = c(0, 0)
rej = replicate(200, {
  y = X %*% beta  + rt(n, df=5)
  model = list(y=y, X=X, lam=c(0, 1), lam0=0)  # H0: beta2 = 0
  rrtest_clust(model, "perm")
})
mean(rej)  # Should be ~ 5% since H0 is true.

# 2. Heteroskedastic example
set.seed(123)
n = 200
X = cbind(rep(1, n), 1:n/n)
beta = c(-1, 0.2)
ind = c(rep(0, 0.9*n), rep(1, .1*n))  # cluster indicator
y = X %*% beta + rnorm(n, sd= (1-ind) * 0.1 + ind * 5) # heteroskedastic
confint(lm(y ~ X + 0))  # normal OLS does not reject H0: beta2 = 0
cl = list(which(ind==0), which(ind==1))
model = list(y=y, X=X, lam=c(0, 1), lam0=0)

rrtest_clust(model, "sign")  # errors are sign symmetric regardless of cluster.
# Cluster sign test does not reject because of noise.

rrtest_clust(model, "perm", cl)  # errors are exchangeable within clusters
# Cluster permutation test rejects because inference is sharper.