library(SatTagSim) library(knitr) username = as.character(Sys.info()[7]) setwd(paste0('C:/Users/', username,'/Google Drive/PHD/COLLABORATIONS/SCRS_SIMS/')) datadir = 'D:/SKYDRIVE_BACKUP/PHD_SIMS/SCRS_SIMS' mycol = colorRampPalette(c("lightcyan", "royalblue", "blue", "lemonchiffon", "orange", "red"), space = "Lab") load('sim_comparison_models.Rdata')
eatl.ct = get.trans.prob(eatl.datbox, perc = F) watl.big.ct = get.trans.prob(watl.big.datbox, perc = F) watl.small.ct = get.trans.prob(watl.small.datbox, perc = F) eatl = get.trans.prob(eatl.datbox, perc = T) watl.big = get.trans.prob(watl.big.datbox, perc = T) watl.small = get.trans.prob(watl.small.datbox, perc = T)
The expected number of times the outcome $i$ observed over $n$ trials is \begin{equation} {\displaystyle \operatorname {E} (X_{i})=np_{i}\,} \end{equation}
\begin{equation} \operatorname{v}(X_i) = np_i(1-p_i) \end{equation}
where $n$ is the number of trials and $p$ is the probability. Here, I consider each row of a movement matrix a multinomial. This is what we discussed in the proposal defense. code modified from \color{blue}here
I'm not sure how to represent $n$, but I assumed the transition between seasons was a single trial for each row.
# function to calculate variance of a multinomial row get.mvar = function(x){ p = drop(x) # use the percentages v = p*(1-p) # c = outer(p, 1-p) # vcv = diag(c) v } # function to sample from a posterior multinomial distribution get.mvar.samp = function(vmu, vsd) { o = rtruncnorm(1, mean = vmu, sd = vsd, a = 0, 1) o[is.nan(o)] = 0 o } # example for seasonal transition matrices vmat = lapply(watl.big, function(x) apply(x, 1, get.mvar))
kable(vmat[[1]], row.names = 1:7, col.names = 1:7, caption = "Multinomial variance for Western bluefin >185cm, in Winter", digits = 4)
\newpage
Example using Western bluefin >185cm
ex7 = expand.grid(1:7, 1:7) par(mfrow=c(2,2)) sapply(1:4, function(x) { image.plot(1:7, 1:7, vmat[[x]], col = mycol(49), xlab = '', ylab = '') text(ex7[,1], ex7[,2], round(as.vector(watl.big[[x]]),3), font = 2) title(names(watl.big)[x]) } )
\newpage
samp = data.frame(sapply(1:7, function(x) get.mvar.samp(watl.big[[1]][x,], vmat[[1]][x,]))) names(samp) = 1:7 kable(samp, row.names = 1:7, caption = 'Posterior sample: Western Atlantic bluefin > 185cm, Winter season', digits = 4)
\newpage
returns the pchisq value
$\chi^{2}$ multinomial goodness of fit test from \color{blue}here
\begin{equation} {\displaystyle \operatorname {\chi^{2}} = \sum_{i} \frac{(f_{i}-e_{i})^2}{e_{i}}\,} \end{equation}
# first arguement is counts, second argument is probability. In this way, row-wise comparisons of two matrices may be made # p-values of the log-linear (chisq test) are returned get.loglin <- function(mat1, mat2){ fm = NULL for(i in 1:nrow(mat1)){ fm[[i]] = loglin(rbind(mat1[i,], mat2[i,]+1e-30),c(1), print = F, fit = F) } unlist(lapply(fm, function(x) 1-pchisq(x$lrt, x$df))) } get.mgof <- function(mat1, mat2){ fm = NULL for(i in 1:nrow(mat1)){ fm[i] = chisq.test(mat1[i,], p = mat2[i,]+1e-30)$p.value } fm }
I tried several versions of log-linear and $\chi^2$ tests. Log linear models, and their GLM versions, are essentially wrappers on $\chi^2$ tests for ease of use and flexibility in using contingency tables in various formats.
Since each row of the movments matrices is dependent on it's associated region i nthe previous time step, the matrix as a whole cannot be considered a contingency table. Each row, however, may be considered it's own 1-dimensional contingency table.
To compare any two rows, within a matrix or between matrices, the best method appeared to be the $\chi^2$ test. To set this up, counts within a row are compared to the probablity in another row. The probability from the second matirx row is simply $\frac{x_{i}}{\sum x_{i} }$, and is the markovian result we have used to date.
To do goodness of fit comparisons, I used the get.mgof
function. If any probabilities are $0$, the $\chi^2$ has issues. I added $1e-30$ to alleviate this problem. Perhaps there is a better way?
\newpage
bw.bs = get.mgof(watl.big.ct[[1]], watl.big[[2]]) print(bw.bs)
# same row, same season, large and small WABFT # (bg.sm = get.loglin(watl.big.ct[[1]], watl.small[[1]])) bg.sm = data.frame(sapply(1:4, function(x) get.mgof(watl.big.ct[[x]], watl.small[[x]]))) names(bg.sm) = names(watl.big) kable(bg.sm, row.names = 1:7, caption = "Western Atlantic bluefin > 185 cm vs < 185 cm", digits = 4)
\newpage
e.w.s = data.frame(sapply(1:4, function(x) get.mgof(watl.big.ct[[x]], eatl[[x]]))) names(e.w.s) = names(watl.big) # library(xtable) # print(xtable(e.w.s, include.rownames = T), type = 'html') kable(e.w.s, row.names = 1:7, caption = "Western Atlantic bluefin > 185 cm vs Eastern Atlantic bluefin", digits = 4)
# get overall transition rate; multiply the seasonal matrices b1 = watl.big b2 = eatl ball.w = b1[[1]]%*%b1[[2]]%*%b1[[3]]%*%b1[[4]] ball.e = b2[[1]]%*%b2[[2]]%*%b2[[3]]%*%b2[[4]] (e.w = get.loglin(ball.e, ball.w))
\newpage
this helped answer a question I had earlier
plot(pchisq(rep(50,100), df = 1:100), typ = 'l', lwd= 2, col = 4, xlab = 'DF', ylab = 'Value') title('pchisq value')
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.