three_prop_cv: Estimate 3prop weights using double cross-validation


Description

This function estimates the weights for the 3prop algorithm using a double cross-validation procedure.

Usage

three_prop_cv(M, y, R = 3L, n_folds = 3L, reg = 1e-09,
  method = "LDA")

Arguments

M

normalized affinity matrix, as returned by normalize_A(), of size N x N. Must be of class sparseMatrix.

y

vector of labels, of length N. Must be of class sparseVector.

R

maximum length of the random walks to consider. The default, 3, is the usual choice for 3prop.

n_folds

number of CV folds to use in both inner and outer CV loops.

reg

regularization parameter, used by both the "LDA" and "Ridge" methods.

method

string naming the method used to compute the coefficients. Can be "LDA" (default) or "Ridge". Both use the parameter reg.
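
To illustrate how a regularization parameter such as reg typically enters a ridge fit, here is a minimal base R sketch. It is not the package's code: the helper ridge_coef, the toy data, and the absence of an intercept are all assumptions made for illustration.

```r
# Illustrative sketch only; ridge_coef is a hypothetical helper, not part of threepRop.
set.seed(42)

# Toy design matrix with 3 columns, standing in for R = 3 random walk features.
X <- matrix(rnorm(60), nrow = 20, ncol = 3)
y <- as.numeric(X %*% c(1, -2, 0.5))

# Closed-form ridge solution without intercept: (X'X + reg * I)^{-1} X'y.
ridge_coef <- function(X, y, reg = 1e-09) {
  p <- ncol(X)
  drop(solve(crossprod(X) + reg * diag(p), crossprod(X, y)))
}

alpha_small <- ridge_coef(X, y)              # tiny reg: close to least squares
alpha_large <- ridge_coef(X, y, reg = 1e3)   # large reg: coefficients shrink toward 0
```

With a value as small as the default 1e-09, the penalty mostly serves to keep the linear system well conditioned rather than to shrink the coefficients noticeably.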

Details

By double cross-validation (CV), we mean two CV loops, one nested within the other. The first loop calculates the random walk features, while the second loop estimates the coefficients alpha associated with those features.
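
The nesting can be sketched in base R as follows. This is one possible reading of the procedure, not the package's implementation: the helper rw_features, the 0/1 label coding, the dense toy matrix, and the way labels are hidden are all assumptions.

```r
# Illustrative sketch only; NOT the code used by three_prop_cv().
set.seed(1)
N <- 30L; R <- 3L; n_folds <- 3L

# Toy row-normalized affinity matrix (dense for brevity) and 0/1 labels.
A <- matrix(runif(N * N), N, N)
A <- (A + t(A)) / 2
diag(A) <- 0
M <- A / rowSums(A)
y <- rbinom(N, 1, 0.3)

# Hypothetical helper: column r holds the r-step random walk scores M^r y.
rw_features <- function(M, y, R) {
  X <- matrix(0, nrow(M), R)
  v <- y
  for (r in seq_len(R)) {
    v <- as.numeric(M %*% v)
    X[, r] <- v
  }
  X
}

outer_id <- sample(rep(seq_len(n_folds), length.out = N))
n_fits <- 0L
for (k in seq_len(n_folds)) {               # outer loop: fold k is held out
  train <- which(outer_id != k)
  inner_id <- sample(rep(seq_len(n_folds), length.out = length(train)))
  X_train <- matrix(NA_real_, length(train), R)
  for (j in seq_len(n_folds)) {             # inner loop: out-of-fold features
    hidden <- train[inner_id == j]
    y_masked <- y
    y_masked[c(hidden, which(outer_id == k))] <- 0  # hide inner fold and test labels
    X_train[inner_id == j, ] <- rw_features(M, y_masked, R)[hidden, , drop = FALSE]
  }
  # At this point alpha would be estimated from (X_train, y[train])
  # and then evaluated on the held-out fold k.
  n_fits <- n_fits + 1L
}
```

Hiding the labels of a node before computing its features avoids the node's own label leaking into the features that are later used to fit the coefficients.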

Additionally, the function computes the area under the ROC curve (AUROC) for every fold in the outer CV loop.
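
For readers unfamiliar with the metric, the AUROC of a set of scores can be computed from ranks (the Mann-Whitney formulation). The sketch below is a generic base R version, not necessarily how the package computes it; the function name auroc and the 0/1 label coding are assumptions.

```r
# Illustrative sketch only; auroc is a hypothetical helper, not part of threepRop.
# Assumes labels coded as 1 (positive) and 0 (negative), with both classes present.
auroc <- function(scores, labels) {
  pos <- labels == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  r <- rank(scores)  # ties receive average ranks
  (sum(r[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_perfect  <- auroc(c(0.9, 0.8, 0.3, 0.1), c(1, 1, 0, 0))  # positives ranked on top
auc_reversed <- auroc(c(0.9, 0.8, 0.3, 0.1), c(0, 0, 1, 1))  # positives ranked last
auc_ties     <- auroc(c(1, 1, 1, 1), c(1, 1, 0, 0))          # uninformative scores
```

A value of 1 means every positive is scored above every negative, 0.5 corresponds to uninformative scores, and 0 means the ranking is fully inverted.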

Value

A list with two elements: a matrix of size R x n_folds, containing the weights estimated for each (outer) CV iteration, and a vector of length n_folds containing the AUROC for each such iteration.

References

Mostafavi, S., Goldenberg, A., & Morris, Q. (2012). Labeling nodes using three degrees of propagation. PLoS ONE, 7(12), e51947.

Examples

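# Simulate data from a simple SBM
sim_SBM = simulate_simple_SBM(N = 2500L, p_1 = 0.2, D = 0.04, R = 0.25)
# Normalize the simulated affinity matrix
M = normalize_A(sim_SBM$A, "asym")
# Estimate the weights with the default method (LDA), then with Ridge
three_prop_cv(M = M, y = sim_SBM$y)
three_prop_cv(M = M, y = sim_SBM$y, method = "Ridge")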

miguelbiron/threepRop documentation built on May 29, 2019, 9:31 a.m.