rdcSubset: Randomized dependence coefficients score on given subset


View source: R/KDSNvarSelect.R

Description

Variable pre-selection scoring for KDSN. Estimates the RDC score for a given subset of variables.

Usage

  rdcSubset(binCode, x, y, k=20, s=1/6, f=sin, seedX=NULL, seedY=NULL, 
  rdcRep=1, trans0to1=TRUE)

Arguments

binCode

Specifies which subset of the covariates is used to explain the responses (binary vector). A one includes the corresponding variable and a zero excludes it.

x

Covariates data (numeric matrix).

y

Responses (numeric matrix).

k

Number of random features (integer scalar).

s

Variance of the random weights. Default is 1/6.

f

Non-linear transformation function. Default is sin.

seedX

Random number seed of the normally distributed weights for the covariates (integer scalar). Default is to randomly draw weights.

seedY

Random number seed of the normally distributed weights for the responses (integer scalar). Default is to randomly draw weights.

rdcRep

Number of RDC repetitions. The results of all repetitions are averaged to give a more robust estimate. Default is one repetition.

trans0to1

Should the design matrix and response be transformed to the interval [0, 1]? (Logical). If the data is already available in this form, it can be evaluated much faster.
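As an illustration of the binCode argument, the following base R sketch builds the binary vector from a set of column indices. The names p, selected and Xsub are hypothetical, and the column selection in the last step is an assumption about how the binary vector is interpreted, not a statement about the package internals.

```r
# Hypothetical setup: 10 covariates, of which columns 1, 3 and 7 are selected
p <- 10
selected <- c(1, 3, 7)

# Binary inclusion vector: 1 = include, 0 = exclude
binCode <- as.integer(seq_len(p) %in% selected)

# Assumed interpretation: entries equal to 1 pick the matching columns
X <- matrix(seq_len(5 * p), nrow = 5, ncol = p)
Xsub <- X[, binCode == 1, drop = FALSE]
```

A vector built this way has the same length as the number of columns of x, which is what the calls in the Examples section rely on.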

Details

Covariates are ranked according to their dependence with the response variable.
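The kind of score being estimated can be sketched in base R along the lines of the short implementation given in the paper listed under References. This is a rough sketch, not the source of rdcSubset: the helper name rdcSketch is hypothetical, it assumes that binCode selects the columns of x whose entries equal one, and it leaves out the seed handling and the averaging over rdcRep repetitions.

```r
# Hypothetical helper illustrating the RDC idea; not the kernDeepStackNet code
rdcSketch <- function(binCode, x, y, k = 20, s = 1/6, f = sin) {
  # Assumption: entries equal to 1 select the corresponding columns of x
  x <- as.matrix(x)[, binCode == 1, drop = FALSE]
  y <- as.matrix(y)
  # Rank-based transformation of every column to (0, 1],
  # followed by a constant column that acts as a bias term
  x <- cbind(apply(x, 2, function(u) rank(u) / length(u)), 1)
  y <- cbind(apply(y, 2, function(u) rank(u) / length(u)), 1)
  # k random projections with normally distributed weights, scaled via s
  x <- s / ncol(x) * x %*% matrix(rnorm(ncol(x) * k), nrow = ncol(x))
  y <- s / ncol(y) * y %*% matrix(rnorm(ncol(y) * k), nrow = ncol(y))
  # Largest canonical correlation between the non-linear features
  cancor(f(x), f(y))$cor[1]
}

# Toy data: y depends on the first column only
set.seed(1)
X <- matrix(rnorm(400), ncol = 2)
y <- X[, 1]
scoreDep <- rdcSketch(c(1, 0), X, y)  # subset containing the relevant column
scoreInd <- rdcSketch(c(0, 1), X, y)  # subset containing only the unrelated column
```

In this toy setting the subset that contains the relevant column should receive the clearly higher score, which is the behaviour a subset scoring function is meant to show.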

Value

RDC score (numeric scalar).

Author(s)

Thomas Welchowski welchow@imbie.meb.uni-bonn.de

References

David Lopez-Paz, Philipp Hennig and Bernhard Schoelkopf (2013), The Randomized Dependence Coefficient, Proceedings of Neural Information Processing Systems (NIPS) 26, Stateline, Nevada, USA, C.J.C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K.Q. Weinberger (eds.)

See Also

rdcPart, cancorRed, rdcVarOrder, rdcVarSelSubset

Examples

#############################
# Cubic noisy association

# Generate 10 covariates
library(mvtnorm)
set.seed(3489)
X <- rmvnorm(n=200, mean=rep(0, 10))

# Generate responses based on some covariates
set.seed(-239247)
y <- 0.5*X[, 1]^3 - 2*X[, 2]^2 + X[, 3] - 1 + rnorm(200)

# Score of true subset
scoreTrue <- rdcSubset(binCode=c(rep(1, 3), rep(0, 7)),
                       x=X, y=y, seedX=1:10, seedY=-(1:10), rdcRep=10)
scoreTrue

# Only unnecessary variables
scoreFalse <- rdcSubset(binCode=c(rep(0, 3), rep(1, 7)),
                        x=X, y=y, seedX=1:10, seedY=-(1:10), rdcRep=10)
scoreFalse

# Two important variables and some non-causal variables
scoreMix <- rdcSubset(binCode=c(1, 0, 1, rep(0, 3), rep(1, 4)),
                      x=X, y=y, seedX=1:10, seedY=-(1:10), rdcRep=10)
scoreMix

kernDeepStackNet documentation built on May 2, 2019, 8:16 a.m.