This function conducts robust statistical tests for the means of multivariate data after adjusting for known or unknown latent factors, using the methods in Fan et al. (2017) and Zhou et al. (2017). It uses Huber's loss function (Huber, 1964) to robustly estimate the data parameters.
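To illustrate what Huber's loss does, here is a minimal base-R sketch. It is not the package's internal implementation, which this page does not show; the helper names `huber_loss` and `huber_mean` and the threshold value 1.5 are assumptions made for the example. The loss is quadratic for small residuals and linear beyond a threshold, so a single gross outlier has bounded influence on the estimated location.

```r
# Huber loss: quadratic for |u| <= tau, linear beyond the threshold tau
huber_loss <- function(u, tau) {
  ifelse(abs(u) <= tau, 0.5 * u^2, tau * abs(u) - 0.5 * tau^2)
}

# Robust location estimate: minimise the summed Huber loss over a scalar mu
# (hypothetical helper for illustration only)
huber_mean <- function(x, tau) {
  optimize(function(mu) sum(huber_loss(x - mu, tau)), interval = range(x))$minimum
}

set.seed(1)
x <- c(rnorm(50), 50)        # 50 standard-normal draws plus one gross outlier
mean(x)                      # sample mean, pulled toward the outlier
huber_mean(x, tau = 1.5)     # Huber estimate, stays close to the bulk of the data
```

A smaller threshold gives more protection against outliers at the cost of some efficiency on clean data, which is why the threshold is treated as a tuning parameter.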
X |
a n x p data matrix with each row being a sample.
You wish to test a hypothesis for the mean of each column of |
H0 |
an optional p x 1 vector of the true values of the means (or of the differences in means if you are performing a two-sample test). The default is zero. |
fx |
an optional factor matrix with each column being a factor for |
Kx |
an optional number of factors to be estimated for |
Y |
an optional data matrix that must have the same number of columns as |
fy |
an optional factor matrix with each column being a factor for |
Ky |
an optional number of factors to be estimated for |
alternative |
an optional character string specifying the alternative hypothesis; must be one of "two.sided" (default), "greater" or "lesser". You can specify just the initial letter. |
alpha |
an optional level for controlling the false discovery rate (in decimals). Default is 0.05. Must be in (0,1). |
robust |
a boolean, specifying whether or not to use robust estimators for mean and variance. Default is TRUE. |
cv |
a boolean, specifying whether or not to run cross-validation for the tuning parameter. Default is TRUE. Only used if |
tau |
|
verbose |
a boolean specifying whether to print runtime updates to the console. Default is TRUE. |
... |
Arguments passed to the |
alternative = "greater" is the alternative that X has a larger mean than Y.
If some of the underlying factors are known but further unobserved confounding factors are suspected, proceed in two steps. Suppose we have data X = μ + Bf + Cg + u, where f is observed and g is unobserved. In the first step, pass the data {X, f} into the main function and, from the output, construct the residuals Xres = X - Bf. In the second step, pass Xres into the main function without any factors. The output of this second step is the final answer to the testing problem.
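The residual construction can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's procedure: ordinary least squares stands in for the loading estimate that the main function would return in the first step, the simulated dimensions are arbitrary, and the final `farm.test` call is left commented out so the block runs without the package.

```r
set.seed(1)
n <- 50; p <- 20
f  <- matrix(rnorm(n * 2), nrow = n)          # observed factors (n x 2)
g  <- matrix(rnorm(n * 1), nrow = n)          # unobserved factor (n x 1)
B  <- matrix(runif(p * 2, -2, 2), nrow = p)   # loadings on f (p x 2)
C  <- matrix(runif(p * 1, -2, 2), nrow = p)   # loadings on g (p x 1)
mu <- rep(0, p)
X  <- rep(1, n) %*% t(mu) + f %*% t(B) + g %*% t(C) +
  matrix(rnorm(n * p), nrow = n)

# Step 1 stand-in (assumption): estimate B by least squares of each column
# of X on f. In the procedure described above, B comes from the output of
# the main function instead.
fit   <- lm(X ~ f)
B_hat <- t(coef(fit)[-1, , drop = FALSE])     # p x 2, intercept row dropped

# Residuals as defined in the text: Xres = X - B f (the mean is retained)
Xres <- X - f %*% t(B_hat)

# Step 2: pass Xres to the main function with no factors supplied
# output <- farm.test(Xres)
```

Only the observed-factor component is removed here; the contribution of the unobserved factor g remains in Xres and is left for the second call to handle.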
For a two-sample test, the output values means, stderr, n, nfactors and loadings are all lists containing two items, each pertaining to X and Y, indicated by a prefix X. and Y. respectively.
The data matrix must have at least 4 rows and at least 4 columns so that latent factors can be calculated.
For details about multiple comparison correction, see farm.FDR.
The tuning parameter = tau * sigma * optimal rate, where optimal rate is the optimal rate for the tuning parameter and sigma is the standard deviation of the data. For details, see Fan et al. (2017).
An object with S3 class farm.test containing:
means |
estimated means |
stderr |
estimated standard errors |
pvalue |
unadjusted p values |
rejected |
the indices of rejected hypotheses, along with their corresponding p values, and adjusted p values, ordered from most significant to least significant |
alldata |
all the indices of the tested hypotheses, along with their corresponding p values, adjusted p values, and a column with 1 if declared significant and 0 if not |
loadings |
estimated factor loadings |
nfactors |
the number of (estimated) factors |
significant |
the number of means that are found significant |
... |
further arguments passed to methods. For complete list use the function |
Huber, P.J. (1964). "Robust Estimation of a Location Parameter." The Annals of Mathematical Statistics, 35, 73–101.
Fan, J., Ke, Y., Sun, Q. and Zhou, W-X. (2017). "FARM-Test: Factor-Adjusted Robust Multiple Testing with False Discovery Control", https://arxiv.org/abs/1711.05386.
Zhou, W-X., Bose, K., Fan, J. and Liu, H. (2017). "A New Perspective on Robust M-Estimation: Finite Sample Theory and Applications to Dependence-Adjusted Multiple Testing," Annals of Statistics, to appear, https://arxiv.org/abs/1711.05381.
library(FarmTest)

set.seed(100)
p = 100
n = 50
epsilon = matrix(rnorm(p * n, 0, 1), nrow = n)
B = matrix(runif(p * 3, -2, 2), nrow = p)
fx = matrix(rnorm(3 * n, 0, 1), nrow = n)
mu = rep(0, p)
mu[1:5] = 2
X = rep(1, n) %*% t(mu) + fx %*% t(B) + epsilon
output = farm.test(X, cv = FALSE)  # robust, no cross-validation
output

# other robustification options
output = farm.test(X, robust = FALSE, verbose = FALSE)  # non-robust
output = farm.test(X, tau = 3, cv = FALSE, verbose = FALSE)  # robust, no cross-validation, specified tau
# output = farm.test(X)  # robust, cross-validation, longer running

# two-sample test
n2 = 25
epsilon = matrix(rnorm(p * n2, 0, 1), nrow = n2)
B = matrix(rnorm(p * 3, 0, 1), nrow = p)
fy = matrix(rnorm(3 * n2, 0, 1), nrow = n2)
Y = fy %*% t(B) + epsilon
output = farm.test(X = X, Y = Y, robust = FALSE)  # non-robust
output = farm.test(X = X, Y = Y, Kx = 0, cv = FALSE)  # robust, no cross-validation, Kx set to 0
names(output$means)