crispify: Observation-Space Stochastic Correction In widals: Weighting by Inverse Distance with Adaptive Least Squares for Massive Space-Time Data

Description

Improve observation-space predictions using 'left over' spacial correlation between model residuals

Usage

 ```1 2``` ```crispify(locs1, locs2, Z.delta, z.lags.vec, geodesic, alpha, flatten, self.refs, lags, stnd.d = FALSE, log10cutoff = -16) ```

Arguments

 `locs1` Locations of supporting sites. An n x 3 matrix, first column is spacial x, second column is spacial y, third column contains relative temporal 'distance'. If the `geodesic` is `TRUE`, make sure latitude is in the first column. `locs2` Locations of interpolation sites. An n* x 3 matrix, where n* is the number of interpolation sites. See `locs1` above. `Z.delta` Observed residuals. A τ x x matrix. `z.lags.vec` Temporal lags. An integer vector or scalar. `geodesic` Use geodesic distance? Boolean. If true, distance (used internally) is in units kilometers. `alpha` The WIDALS distance rate hyperparameter. A scalar non-negative number. `flatten` The WIDALS 'flattening' hyperparameter. A scalar non-negative number. Typically between 0 and some number slightly greater than 1. When 0, no crispification. `self.refs` Which sites are self-referencing? An integer vector of (zero-based) lag indices, OR a scalar set to `-1`. This argument only has meaning when `locs1` is identical to `locs2`. If the `lags` argument is, say, 0, then it would be pointless to smooth predictions with existing values. In this case, we can set `self.refs = 0`. If `locs1` is NOT the same as `locs2`, then set this argument to `-1`. `lags` Temporal lags. An integer vector or scalar. E.g., if the data's time increment is daily, then `lags = c(-1,0,1)` would have `crispify` smooth today's predictions using yesterdays, today's, and tomorrow's observed residuals. `stnd.d` Spacial compression. Boolean. `log10cutoff` Weight threshold. A scalar number. A value of, e.g., -10, will instruct `crispify` to ignore weights less than 10^(-10) when smoothing.

Details

This function is called inside `widals.predict` and `widals.snow`. It may be useful for the user in building their own WIDALS model extensions.

Value

A τ x x matrix.

`widals.predict`, `widals.snow`.
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121``` ```######### here's an itty-bitty example ######### simulate itty-bitty data tau <- 21 #### number of time points d.alpha <- 2 R.scale <- 1 sigma2 <- 0.01 F <- 1 Q <- 0 n.all <- 14 ##### number of spacial locations set.seed(9999) library(SSsimple) locs.all <- cbind(runif(n.all, -1, 1), runif(n.all, -1, 1)) #### random location of sensors D.mx <- distance(locs.all, locs.all, FALSE) #### distance matrix #### create measurement variance using distance and covariogram R.all <- exp(-d.alpha*D.mx) + diag(sigma2, n.all) Hs.all <- matrix(1, n.all, 1) #### constant mean function ##### use SSsimple to simulate system xsssim <- SS.sim(F=F, H=Hs.all, Q=Q, R=R.all, length.out=tau, beta0=0) Z.all <- xsssim\$Z ###### system observation matrix ######## suppose use the global mean as a prediction z.mean <- mean(Z.all) Z.delta <- Z.all - z.mean z.lags.vec <- rep(0, n.all) geodesic <- FALSE alpha <- 5 flatten <- 1 ## emmulate cross-validation, i.e., ## don't use observed site values to predict themselves (zero-based) self.refs <- 0 lags <- 0 locs1 <- cbind(locs.all, rep(0, n.all)) locs2 <- cbind(locs.all, rep(0, n.all)) Z.adj <- crispify(locs1, locs2, Z.delta, z.lags.vec, geodesic, alpha, flatten, self.refs, lags, stnd.d = FALSE, log10cutoff = -16) Z.adj Z.hat <- z.mean + Z.adj sqrt( mean( (Z.all - Z.hat)^2 ) ) ######### set flatten to zero -- this means no crispification Z.adj <- crispify(locs1, locs2, Z.delta, z.lags.vec, geodesic, alpha, flatten=0, self.refs, lags, stnd.d = FALSE, log10cutoff = -16) Z.adj Z.hat <- z.mean + Z.adj sqrt( mean( (Z.all - Z.hat)^2 ) ) ## The function is currently defined as function (locs1, locs2, Z.delta, z.lags.vec, geodesic, alpha, flatten, self.refs, lags, stnd.d = FALSE, log10cutoff = -16) { n1 <- nrow(locs1) n2 <- nrow(locs2) tau <- nrow(Z.delta) n.Zd <- ncol(Z.delta) Z.out <- rep(0, tau * n1) z.rep.in <- rep(0:(n.Zd - 1), length.out = n2) t.start <- max(0, -min(lags)) t.stop <- min(tau, tau - max(lags)) t.s.s <- c(t.start, t.stop) if (geodesic) { rlocs1 <- pi * locs1[, 1:2]/180 rlocs2 <- pi * locs2[, 1:2]/180 Z.out <- .C("crispify", as.double(cos(rlocs1[, 1]) * cos(rlocs1[, 2])), as.double(cos(rlocs1[, 1]) * sin(rlocs1[, 2])), as.double(sin(rlocs1[, 1])), as.double(locs1[, 3]), as.double(cos(rlocs2[, 1]) * cos(rlocs2[, 2])), as.double(cos(rlocs2[, 1]) * sin(rlocs2[, 2])), as.double(sin(rlocs2[, 1])), as.double(locs2[, 3]), as.double(as.vector(Z.delta)), as.double(Z.out), as.double(alpha), as.double(flatten), as.integer(self.refs), as.integer(length(self.refs)), as.integer(z.lags.vec), as.integer(z.rep.in), as.integer(n.Zd), as.integer(n1), as.integer(n2), as.integer(tau), as.integer(stnd.d), as.integer(t.s.s), as.integer(geodesic), as.double(log10cutoff))[[10]] } else { Z.out <- .C("crispify", as.double(locs1[, 1]), as.double(locs1[, 2]), as.double(0), as.double(locs1[, 3]), as.double(locs2[, 1]), as.double(locs2[, 2]), as.double(0), as.double(locs2[, 3]), as.double(as.vector(Z.delta)), as.double(Z.out), as.double(alpha), as.double(flatten), as.integer(self.refs), as.integer(length(self.refs)), as.integer(z.lags.vec), as.integer(z.rep.in), as.integer(n.Zd), as.integer(n1), as.integer(n2), as.integer(tau), as.integer(stnd.d), as.integer(t.s.s), as.integer(geodesic), as.double(log10cutoff))[[10]] } dim(Z.out) <- c(tau, n1) return(Z.out) } ```