Estimate adjacency matrix for equivalent FLC distributions based on states

Description

This function estimates the adjacency matrix \mathbf{A} of all pairwise equivalent FLC distributions given the states s_1, …, s_K. See Details below.

Usage

estimate_state_adj_matrix(states = NULL, FLCs = NULL, pdfs.FLC = NULL, alpha = NULL, 
    distance = function(f, g) return(mean(abs(f - g))))

Arguments

states

vector of length N with entry i being the label k = 1, …, K of PLC i

FLCs

N \times n_f matrix of FLCs (only necessary if distance = "KS")

pdfs.FLC

N \times K matrix of all K state-conditional FLC densities evaluated at each FLC \ell^{+}_i, i=1, …, N (only necessary if distance = function(f, g) return(...)).

alpha

significance level for testing. Default: alpha = NULL (this returns a matrix of p-values if distance = "KS")

distance

either a Kolmogorov-Smirnov test (distance = "KS") or a metric given as a function (e.g. an L_q distance). A user-supplied function must take two arguments, f and g, and return a single value.

Default: distance = function(f, g) return(mean(abs(f-g))) \rightarrow L_1 distance.

Value

A K \times K adjacency matrix containing either a trimmed version of exp(-distance) or p-values. If alpha is not NULL, the thresholded 0/1 matrix is returned instead. Note that here 1 stands for equivalent, i.e. H_0 is not rejected: the matrix is obtained by checking pval > alpha (rather than the usual pval < alpha).
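The thresholding rule can be illustrated with a minimal sketch in base R. The p-value matrix pvals and the level alpha below are made-up values for illustration only; they are not output of estimate_state_adj_matrix.

```r
# Hypothetical 3 x 3 matrix of pairwise p-values (symmetric, diagonal = 1).
pvals <- matrix(c(1.00, 0.40, 0.01,
                  0.40, 1.00, 0.03,
                  0.01, 0.03, 1.00), nrow = 3, byrow = TRUE)
alpha <- 0.05

# 1 = "equivalent" (H_0: f = g is not rejected), 0 = "different" (rejected).
# Note the direction of the comparison: pval > alpha, not pval < alpha.
adj <- (pvals > alpha) * 1
adj
```

In this toy example states 1 and 2 are linked, while state 3 is linked only to itself.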

Details and user-defined distance function

The (i,j)th element of the adjacency matrix is defined as

\mathbf{A}_{ij} = distance(P(X \mid s_i), P(X \mid s_j)) = distance(f, g),

where distance is either

a metric

in the function space of pdfs f and g, or

a two sample test

for H_0: f=g, e.g. a Kolmogorov-Smirnov test (distance="KS").
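To make the two-sample-test variant concrete, here is a rough sketch that fills a K \times K matrix with pairwise Kolmogorov-Smirnov p-values using ks.test from base R. It assumes one-dimensional FLCs and hard state labels; the data (flcs, states) are simulated for illustration, and the loop is not necessarily how estimate_state_adj_matrix computes its result internally.

```r
set.seed(1)

# Hypothetical data: 300 one-dimensional FLCs with a state label for each.
# States 1 and 2 share a distribution; state 3 is shifted.
states <- rep(1:3, each = 100)
flcs <- c(rnorm(100, mean = 0), rnorm(100, mean = 0), rnorm(100, mean = 3))

K <- length(unique(states))
pvals <- diag(1, K)  # a state is always equivalent to itself

# Test H_0: f = g for every pair of states and store the p-value symmetrically.
for (i in seq_len(K - 1)) {
  for (j in (i + 1):K) {
    p <- ks.test(flcs[states == i], flcs[states == j])$p.value
    pvals[i, j] <- p
    pvals[j, i] <- p
  }
}
round(pvals, 3)
```

A large p-value for a pair of states means the test found no evidence that their FLC distributions differ.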

Again we use a functional programming approach and allow the user to specify any valid distance/similarity function distance = function(f, g) return(...).

If distance = "KS", the adjacency matrix contains the p-values of a Kolmogorov-Smirnov test or their thresholded versions (if alpha is not NULL); see Value for details.

Otherwise distance is an R function that takes as input two vectors f and g (e.g. the wKDE estimates for two states) and returns a non-negative real number measuring their distance. The default is the L_1 distance, distance = function(f, g) return(mean(abs(f-g))).
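As an illustration of user-defined distances, the following sketch defines a few candidate functions with the required (f, g) signature and applies one of them to every pair of columns of a density matrix. The names dist_L2, dist_sup and pairwise_distance are hypothetical helpers, and the densities are evaluated on a common grid purely for illustration (the documented pdfs.FLC is evaluated at the FLCs themselves).

```r
# Candidate distances on two density vectors of equal length.
dist_L1  <- function(f, g) mean(abs(f - g))       # the documented default
dist_L2  <- function(f, g) sqrt(mean((f - g)^2))  # L_2-type distance
dist_sup <- function(f, g) max(abs(f - g))        # supremum distance

# Hypothetical N x K matrix: three normal densities on a grid of N = 200 points.
x <- seq(-4, 4, length.out = 200)
pdfs <- cbind(dnorm(x, 0, 1), dnorm(x, 0.1, 1), dnorm(x, 2, 1))

# Apply a distance function to every pair of columns (states).
pairwise_distance <- function(pdfs, distance) {
  K <- ncol(pdfs)
  D <- matrix(0, K, K)
  for (i in seq_len(K)) {
    for (j in seq_len(K)) {
      D[i, j] <- distance(pdfs[, i], pdfs[, j])
    }
  }
  D
}

D_L1 <- pairwise_distance(pdfs, dist_L1)
D_L2 <- pairwise_distance(pdfs, dist_L2)
round(D_L1, 4)
```

Any such function could then be supplied through the distance argument, e.g. distance = dist_L2, together with pdfs.FLC.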

Examples

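# 1000 x 10 matrix of random weights, normalized with normalize()
WW <- matrix(runif(10000), ncol = 10)
WW <- normalize(WW)
# one-dimensional FLCs, one per row of WW
temp_flcs <- cbind(rnorm(nrow(WW)))
# state-conditional FLC densities
temp_pdfs.FLC <- estimate_LC_pdfs(temp_flcs, WW)
# adjacency matrix from the Kolmogorov-Smirnov test (alpha = NULL, so p-values)
AA_ks <- estimate_state_adj_matrix(states = weight_matrix2states(WW), FLCs = temp_flcs, 
    distance = "KS")
# adjacency matrix from the default L_1 distance
AA_L1 <- estimate_state_adj_matrix(pdfs.FLC = temp_pdfs.FLC)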

par(mfrow = c(1, 2), mar = c(1, 1, 2, 1))
image2(AA_ks, zlim = c(0, 1), legend = FALSE, main = "Kolmogorov-Smirnov")
image2(AA_L1, legend = FALSE, main = "L1 distance")
