dcsis | R Documentation |
Performs distance correlation sure independence screening (Li et al., 2012) with some additional options, such as calculating the corresponding tests.
dcsis(
  X,
  Y,
  k = floor(nrow(X)/log(nrow(X))),
  threshold = NULL,
  calc.cor = "spearman",
  calc.pvalue.cor = FALSE,
  return.data = FALSE,
  test = "none",
  adjustp = "none",
  b = 499,
  bias.corr = FALSE,
  use = "all",
  algorithm = "auto"
)
X |
A dataframe or matrix. |
Y |
A vector-valued response having the same length as the number of rows of X. |
k |
Number of variables that are selected (only used when threshold is not provided). |
threshold |
If provided, variables with a distance correlation larger than threshold are selected. |
calc.cor |
If set as "pearson", "spearman" or "kendall", a corresponding correlation matrix is additionally calculated. |
calc.pvalue.cor |
logical; if TRUE, p-values based on the Pearson or Spearman correlation matrix are calculated using Hmisc::rcorr (not implemented for calc.cor = "kendall"). |
return.data |
logical; specifies if the dcmatrix object should contain the original data. |
test |
Allows for additionally calculating a test based on distance covariance and specifies the type of test that is performed. The default "none" performs no test. "permutation" performs a Monte Carlo permutation test. "gamma" performs a test based on a gamma approximation of the test statistic under the null hypothesis. "conservative" performs a conservative two-moment approximation. "bb3" performs a quite precise three-moment approximation and is recommended when computation time is not an issue. |
adjustp |
If set to "holm", "hochberg", "hommel", "bonferroni", "BH", "BY" or "fdr", the corresponding adjusted p-values are additionally returned for the distance covariance test. |
b |
Number of random permutations used for the permutation test. Ignored for all other tests. |
bias.corr |
logical; specifies whether the bias-corrected version of the sample distance covariance (Huo and Székely, 2016) should be calculated. |
use |
"all" uses all observations, "complete.obs" excludes NAs, "pairwise.complete.obs" uses pairwise complete observations for each comparison. |
algorithm |
Specifies the algorithm used for calculating the distance covariance. "fast" uses an O(n log n) algorithm if the observations are one-dimensional and metr.X and metr.Y are either "euclidean" or "discrete"; see also Huo and Székely (2016). "memsave" uses a memory-saving version of the standard algorithm with computational complexity O(n^2) but requiring only O(n) memory. "standard" uses the classical algorithm. User-specified metrics always use the classical algorithm. "auto" chooses the best algorithm for the specific setting using a rule of thumb. "memsave" is typically very inefficient for dcsis and should only be applied in exceptional cases. |
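The interplay of k and threshold can be illustrated with a minimal base R sketch of the screening idea. It does not reproduce the package's implementation: dcor_naive and screen_sketch are hypothetical helpers, the distance correlation is computed with the naive O(n^2) formula, and the ordering of the returned indices is an assumption.

```r
# Naive O(n^2) sample distance correlation of two numeric vectors
# (hypothetical helper, not part of dcortools).
dcor_naive <- function(x, y) {
  center <- function(v) {
    d <- as.matrix(dist(v))  # pairwise distances |v_i - v_j|
    # double centering: subtract row and column means, add the grand mean
    sweep(sweep(d, 1, rowMeans(d)), 2, colMeans(d)) + mean(d)
  }
  A <- center(x)
  B <- center(y)
  dcov2 <- mean(A * B)                      # squared distance covariance
  denom <- sqrt(mean(A * A) * mean(B * B))  # product of distance standard deviations
  if (denom > 0) sqrt(max(dcov2, 0) / denom) else 0
}

# Screening sketch (hypothetical): rank the columns of X by their distance
# correlation with Y, then keep the top k, or every column above threshold.
screen_sketch <- function(X, Y, k = floor(nrow(X) / log(nrow(X))),
                          threshold = NULL) {
  dc <- apply(X, 2, dcor_naive, y = Y)
  selected <- if (is.null(threshold)) {
    order(dc, decreasing = TRUE)[seq_len(min(k, ncol(X)))]
  } else {
    which(dc > threshold)  # k is ignored once a threshold is given
  }
  list(selected = selected, dcor.selected = dc[selected])
}

set.seed(1)
X <- matrix(rnorm(100 * 200), ncol = 200)
Y <- rowSums(X[, 1:5]) + rnorm(100)

top <- screen_sketch(X, Y)                      # default k = floor(100 / log(100)) = 21
by_thr <- screen_sketch(X, Y, threshold = 0.3)  # 0.3 is an arbitrary illustrative cutoff
length(top$selected)
```

With 100 observations the default keeps 21 variables, whereas the number kept under a threshold depends on the data.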
A dcmatrix object with the following two additional slots:
selected |
Indices of the selected variables. |
dcor.selected |
Distance correlation of the selected variables and the response Y. |
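Berschneider G, Böttcher B (2018). "On complex Gaussian random fields, Gaussian quadratic forms and sample distance multivariance." arXiv preprint.
Dueck J, Edelmann D, Gneiting T, Richards D (2014). "The affinely invariant distance correlation." Bernoulli.
Huang C, Huo X (2017). "A statistically and numerically efficient independence test based on random projections and distance covariance." arXiv preprint.
Huo X, Székely GJ (2016). "Fast computing for distance covariance." Technometrics.
Li R, Zhong W, Zhu L (2012). "Feature screening via distance correlation learning." Journal of the American Statistical Association.
Székely GJ, Rizzo ML, Bakirov NK (2007). "Measuring and testing dependence by correlation of distances." The Annals of Statistics.
Székely GJ, Rizzo ML (2009). "Brownian distance covariance." The Annals of Applied Statistics.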
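# 100 observations of 1000 predictors; only the first 50 enter the response
X <- matrix(rnorm(1e5), ncol = 1000)
Y <- sapply(1:100, function(u) sum(X[u, 1:50])) + rnorm(100)
# screen with the default k = floor(nrow(X)/log(nrow(X)))
a <- dcsis(X, Y)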
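As a variation on the example above, the following sketch selects by threshold rather than by k and additionally requests a gamma-approximation test with Benjamini-Hochberg adjusted p-values. The cutoff 0.3 and the seed are arbitrary illustrative choices, and the call is guarded so that the snippet also runs where dcortools is not installed. The result is inspected with str() because the exact layout of the returned object is not assumed here.

```r
set.seed(42)
X <- matrix(rnorm(1e5), ncol = 1000)
Y <- rowSums(X[, 1:50]) + rnorm(100)

if (requireNamespace("dcortools", quietly = TRUE)) {
  # keep every variable whose distance correlation with Y exceeds 0.3
  # (arbitrary cutoff); k is not used once threshold is supplied
  res <- dcortools::dcsis(X, Y, threshold = 0.3,
                          test = "gamma", adjustp = "BH")
  # top-level overview of the returned dcmatrix object
  str(res, max.level = 1)
}
```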