# get_elts: The function wrapper to get the elements necessary for... In genscore: Generalized Score Matching Estimators

## Description

The function wrapper to get the elements necessary for calculations for all settings.

## Details

Computes the Γ matrix and the g vector for generalized score matching.

Here, Γ is block-diagonal, and in the non-profiled non-centered setting, the j-th block is composed of Γ_{KK,j}, Γ_{Kη,j} and its transpose, and finally Γ_{ηη,j}. In the centered case, only Γ_{KK,j} is computed. In the profiled non-centered case,

Γ_j=Γ_{KK,j}-Γ_{Kη,j}Γ_{ηη,j}^(-1)Γ_{Kη}'.

Similarly, in the non-profiled non-centered setting, g can be partitioned p parts, each with a p-vector g_{K,j} and a scalar g_{η,j}. In the centered setting, only g_{K,j} is needed. In the profiled non-centered case,

g_j=g_{K,j}-Γ_{Kη,j}Γ_{ηη,j}^(-1)g_{η,j}.

The formulae for the pieces above are

Γ_{KK,j}=1/n*∑_{i=1}^n h(Xij)*Xij^(2a-2)*Xi^a*(Xi^a)',

Γ_{Kη,j}=-1/n*∑_{i=1}^n h(Xij)*Xij^(a+b-2)*Xi^a,

Γ_{ηη,j}=1/n*∑_{i=1}^n h(Xij)*Xij^(2b-2),

g_{K,j}=1/n*∑_{i=1}^n (h'(Xij)*Xij^(a-1)+(a-1)*h(Xij)*Xij^(a-2))*Xi^a+a*h(Xij)*Xij^(2a-2)*e_{j,p},

g_{η,j}=1/n*∑_{i=1}^n -h'(Xij)*Xij^(b-1)-(b-1)*h(Xij)*Xij^(b-2)),

where e_{j,p} is the p-vector with 1 at the j-th position and 0 elsewhere.

In the profiled non-centered setting, the function also returns t1 and t2 defined as

t1=Γ_{ηη}^(-1)g_{η}, t2=Γ_{ηη}^(-1)Γ_{Kη}',

so that \hat{η}=t1-t2*vec(\hat{K}).

## Value

A list that contains the elements necessary for estimation.

 n The sample size. p The dimension. centered The centered setting or not. Same as input. scale The scaling method. Same as input. diagonal_multiplier The diagonal multiplier. Same as input. diagonals_with_multiplier A vector that contains the diagonal entries of Γ after applying the multiplier. domain_type The domain type. Same as domain$type in the input. setting The setting. Same as input. g_K The g vector. In the non-profiled non-centered setting, this is the g sub-vector corresponding to K. A p^2-vector. Not returned if setting == "gaussian" && domain$type == "R" since it is just diag(p). Gamma_K The Gamma matrix with no diagonal multiplier. In the non-profiled non-centered setting, this is the Γ sub-matrix corresponding to K. A vector of length p^2 if setting == "gaussian" && domain$type == "R" or p^3 otherwise. g_eta Returned in the non-profiled non-centered setting. The g sub-vector corresponding to η. A p-vector. Not returned if setting == "gaussian" && domain$type == "R" since it is just numeric(p). Gamma_K_eta Returned in the non-profiled non-centered setting. The Γ sub-matrix corresponding to interaction between K and η. If setting == "gaussian" && domain$type == "R", returns a vector of length p, or p^2 otherwise. Gamma_eta Returned in the non-profiled non-centered setting. The Γ sub-matrix corresponding to η. A p-vector. Not returned if setting == "gaussian" && domain$type == "R" since it is just rep(1,p). t1,t2 Returned in the profiled non-centered setting, where the η estimate can be retrieved from t1-t2*\hat{K} after appropriate resizing.

If domain$type == "simplex", the following are also returned.  Gamma_K_jp A matrix of size p by p(p-1). The (j-1)*p+1 through j*p columns represent the interaction matrix between the j-th column and the m-th column of K. Gamma_Kj_etap Non-centered only. A matrix of size p by p(p-1). The j-th column represents the interaction between the j-th column of K and eta[p]. Gamma_Kp_etaj Non-centered only. A matrix of size p by p(p-1). The j-th column represents the interaction between the p-th column of K and eta[j]. Note that it is equal to Gamma_Kj_etap if setting does not contain substring "sum0". Gamma_eta_jp Non-centered only. A vector of size p-1. The j-th component represents the interaction between eta[j] and eta[p]. ## Examples   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 n <- 30 p <- 10 eta <- rep(0, p) K <- diag(p) dm <- 1 + (1-1/(1+4*exp(1)*max(6*log(p)/n, sqrt(6*log(p)/n)))) # Gaussian on R^p: domain <- make_domain("R", p=p) x <- mvtnorm::rmvnorm(n, mean=solve(K, eta), sigma=solve(K)) # Equivalently: x2 <- gen(n, setting="gaussian", abs=FALSE, eta=eta, K=K, domain=domain, finite_infinity=100, xinit=NULL, burn_in=1000, thinning=100, verbose=FALSE) elts <- get_elts(NULL, x, "gaussian", domain, centered=TRUE, scale="norm", diag=dm) elts <- get_elts(NULL, x, "gaussian", domain, FALSE, profiled=FALSE, scale="sd", diag=dm) # Gaussian on R_+^p: domain <- make_domain("R+", p=p) x <- tmvtnorm::rtmvnorm(n, mean = solve(K, eta), sigma = solve(K), lower = rep(0, p), upper = rep(Inf, p), algorithm = "gibbs", burn.in.samples = 100, thinning = 10) # Equivalently: x2 <- gen(n, setting="gaussian", abs=FALSE, eta=eta, K=K, domain=domain, finite_infinity=100, xinit=NULL, burn_in=1000, thinning=100, verbose=FALSE) h_hp <- get_h_hp("min_pow", 1, 3) elts <- get_elts(h_hp, x, "gaussian", domain, centered=TRUE, scale="norm", diag=dm) # Gaussian on sum(x^2) > 1 && sum(x^(1/3)) > 1 with x allowed to be negative domain <- make_domain("polynomial", p=p, rule="1 && 2", ineqs=list(list("expression"="sum(x^2)>1", abs=FALSE, nonnegative=FALSE), list("expression"="sum(x^(1/3))>1", abs=FALSE, nonnegative=FALSE))) xinit <- rep(sqrt(2/p), p) x <- gen(n, setting="gaussian", abs=FALSE, eta=eta, K=K, domain=domain, finite_infinity=100, xinit=xinit, seed=2, burn_in=1000, thinning=100, verbose=FALSE) h_hp <- get_h_hp("min_pow", 1, 3) elts <- get_elts(h_hp, x, "gaussian", domain, centered=FALSE, profiled_if_noncenter=TRUE, scale="", diag=dm) # exp on ([0, 1] v [2,3])^p domain <- make_domain("uniform", p=p, lefts=c(0,2), rights=c(1,3)) x <- gen(n, setting="exp", abs=FALSE, eta=eta, K=K, domain=domain, xinit=NULL, seed=2, burn_in=1000, thinning=100, verbose=FALSE) h_hp <- get_h_hp("min_pow", 1.5, 3) elts <- get_elts(h_hp, x, "exp", domain, centered=TRUE, scale="", diag=dm) elts <- get_elts(h_hp, x, "exp", domain, centered=FALSE, profiled_if_noncenter=FALSE, scale="", diag=dm) # gamma on {x1 > 1 && log(1.3) < x2 < 1 && x3 > log(1.3) && ... && xp > log(1.3)} domain <- make_domain("polynomial", p=p, rule="1 && 2 && 3", ineqs=list(list("expression"="x1>1", abs=FALSE, nonnegative=TRUE), list("expression"="x2<1", abs=FALSE, nonnegative=TRUE), list("expression"="exp(x)>1.3", abs=FALSE, nonnegative=TRUE))) set.seed(1) xinit <- c(1.5, 0.5, abs(stats::rnorm(p-2))+log(1.3)) x <- gen(n, setting="gamma", abs=FALSE, eta=eta, K=K, domain=domain, finite_infinity=100, xinit=xinit, seed=2, burn_in=1000, thinning=100, verbose=FALSE) h_hp <- get_h_hp("min_pow", 1.5, 3) elts <- get_elts(h_hp, x, "gamma", domain, centered=TRUE, scale="", diag=dm) elts <- get_elts(h_hp, x, "gamma", domain, centered=FALSE, profiled_if_noncenter=FALSE, scale="", diag=dm) # a0.6_b0.7 on {x in R_+^p: sum(log(x))<2 || (x1^(2/3)-1.3x2^(-3)<1 && exp(x1)+2.3*x2>2)} domain <- make_domain("polynomial", p=p, rule="1 || (2 && 3)", ineqs=list(list("expression"="sum(log(x))<2", abs=FALSE, nonnegative=TRUE), list("expression"="x1^(2/3)-1.3x2^(-3)<1", abs=FALSE, nonnegative=TRUE), list("expression"="exp(x1)+2.3*x2^2>2", abs=FALSE, nonnegative=TRUE))) xinit <- rep(1, p) x <- gen(n, setting="ab_3/5_7/10", abs=FALSE, eta=eta, K=K, domain=domain, finite_infinity=100, xinit=xinit, seed=2, burn_in=1000, thinning=100, verbose=FALSE) h_hp <- get_h_hp("min_pow", 1.4, 3) elts <- get_elts(h_hp, x, "ab_3/5_7/10", domain, centered=TRUE, scale="", diag=dm) elts <- get_elts(h_hp, x, "ab_3/5_7/10", domain, centered=FALSE, profiled_if_noncenter=TRUE, scale="", diag=dm) # log_log model on {x in R_+^p: sum_j j * xj <= 1} domain <- make_domain("polynomial", p=p, ineqs=list(list("expression"=paste(paste(sapply(1:p, function(j){paste(j, "x", j, sep="")}), collapse="+"), "<1"), abs=FALSE, nonnegative=TRUE))) x <- gen(n, setting="log_log", abs=FALSE, eta=eta, K=K, domain=domain, finite_infinity=100, xinit=NULL, seed=2, burn_in=1000, thinning=100, verbose=FALSE) h_hp <- get_h_hp("min_pow", 2, 3) elts <- get_elts(h_hp, x, "log_log", domain, centered=TRUE, scale="", diag=dm) elts <- get_elts(h_hp, x, "log_log", domain, centered=FALSE, profiled_if_noncenter=FALSE, scale="", diag=dm) # Example of using the uniform distance function to boundary as in Liu (2019) g0 <- function(x) { row_min <- apply(x, 1, min) row_which_min <- apply(x, 1, which.min) dist_to_sum_boundary <- apply(x, 1, function(xx){ (1 - sum(1:p * xx)) / sqrt(p*(p+1)*(2*p+1)/6)}) grad_sum_boundary <- -(1:p) / sqrt(p*(p+1)*(2*p+1)/6) g0 <- pmin(row_min, dist_to_sum_boundary) g0d <- t(sapply(1:nrow(x), function(i){ if (row_min[i] < dist_to_sum_boundary[i]){ tmp <- numeric(ncol(x)); tmp[row_which_min[i]] <- 1 } else {tmp <- grad_sum_boundary} tmp })) list("g0"=g0, "g0d"=g0d) } elts <- get_elts(NULL, x, "exp", domain, centered=TRUE, profiled_if_noncenter=FALSE, scale="", diag=dm, unif_dist=g0) # log_log_sum0 model on the simplex with K having row and column sums 0 (Aitchison model) domain <- make_domain("simplex", p=p) K <- -cov_cons("band", p=p, spars=3, eig=1) diag(K) <- diag(K) - rowSums(K) # So that rowSums(K) == colSums(K) == 0 eigen(K)$val[(p-1):p] # Make sure K has one 0 and p-1 positive eigenvalues x <- gen(n, setting="log_log_sum0", abs=FALSE, eta=eta, K=K, domain=domain, xinit=NULL, seed=2, burn_in=1000, thinning=100, verbose=FALSE) h_hp <- get_h_hp("min_pow", 2, 3) h_hp_dx <- h_of_dist(h_hp, x, domain) # h and h' applied to distance from x to boundary # Does not assume K has 0 row and column sums elts_simplex_0 <- get_elts(h_hp, x, "log_log", domain, centered=TRUE, profiled=FALSE, scale="", diag=1.5) # If want K to have row sums and column sums equal to 0 (Aitchison); estimate off-diagonals only elts_simplex_1 <- get_elts(h_hp, x, "log_log_sum0", domain, centered=FALSE, profiled=FALSE, scale="", diag=1.5) # All entries corresponding to the diagonals of K should be 0: max(abs(sapply(1:p, function(j){c(elts_simplex_1$Gamma_K[j, (j-1)*p+1:p], elts_simplex_1$Gamma_K[, (j-1)*p+j])}))) max(abs(diag(elts_simplex_1$Gamma_K_eta))) max(abs(diag(matrix(elts_simplex_1$g_K, nrow=p)))) 

genscore documentation built on April 28, 2020, 1:06 a.m.