ctStanFit | R Documentation |
Fits a ctsem model specified via ctModel
with type either 'stanct' or 'standt'.
ctStanFit(
datalong,
ctstanmodel,
stanmodeltext = NA,
iter = 1000,
intoverstates = TRUE,
binomial = FALSE,
fit = TRUE,
intoverpop = "auto",
sameInitialTimes = FALSE,
stationary = FALSE,
plot = FALSE,
derrind = NA,
optimize = TRUE,
optimcontrol = list(),
nlcontrol = list(),
nopriors = NA,
priors = FALSE,
chains = 2,
cores = ifelse(optimize, getOption("mc.cores", 2L), "maxneeded"),
inits = NULL,
compileArgs = list(),
forcerecompile = FALSE,
saveCompile = TRUE,
savescores = FALSE,
savesubjectmatrices = FALSE,
saveComplexPars = FALSE,
gendata = FALSE,
control = list(),
verbose = 0,
vb = FALSE,
...
)
datalong |
long format data containing columns for subject id (numeric values, 1 to max subjects), manifest variables, any time dependent (i.e. varying within subject) predictors, and any time independent (not varying within subject) predictors. |
ctstanmodel |
model object as generated by |
stanmodeltext |
already specified Stan model character string, generally leave NA unless modifying Stan model directly. (Possible after modification of output from fit=FALSE) |
iter |
used when |
intoverstates |
logical indicating whether or not to integrate over latent states using a Kalman filter. Generally recommended to set TRUE unless using non-gaussian measurement model. |
binomial |
Deprecated. Logical indicating the use of binary rather than Gaussian data, as with IRT analyses.
This now sets |
fit |
If TRUE, fit specified model using Stan, if FALSE, return stan model object without fitting. |
intoverpop |
if 'auto', set to TRUE if optimizing and FALSE if using hmc. if TRUE, integrates over population distribution of parameters rather than full sampling. Allows for optimization of non-linearities and random effects. |
sameInitialTimes |
if TRUE, include an empty observation for every subject that has no observation at the earliest observation time of the dataset. This ensures that the T0MEANS occurs for every subject at the same time, rather than just at the earliest observation for that subject. Important when modelling trends over time, age, etc. |
stationary |
Logical. If TRUE, T0VAR and T0MEANS input matrices are ignored, the parameters are instead fixed to long run expectations. More control over this can be achieved by instead setting parameter names of T0MEANS and T0VAR matrices in the input model to 'stationary', for elements that should be fixed to stationarity. |
plot |
if TRUE, for sampling, a Shiny program is launched upon fitting to interactively plot samples. May struggle with many (e.g., > 5000) parameters. For optimizing, various optimization details are plotted – in development. |
derrind |
deprecated, latents involved in dynamic error calculations are determined automatically now. |
optimize |
if TRUE, use |
optimcontrol |
list of parameters sent to |
nlcontrol |
List of non-linear control parameters.
|
nopriors |
deprecated, use priors argument. logical. If TRUE, any priors are disabled – sometimes desirable for optimization. |
priors |
if TRUE, priors are included in computations, otherwise specified priors are ignored. |
chains |
used when |
cores |
number of cpu cores to use. Either 'maxneeded' to use as many as available minus one,
up to the number of chains, or a positive integer. If |
inits |
either character string 'optimize, NULL, or vector of (unconstrained)
parameter start values, as returned by the rstan function |
compileArgs |
List of arguments to pass to |
forcerecompile |
logical. For development purposes. If TRUE, stan model is recompiled, regardless of apparent need for compilation. |
saveCompile |
if TRUE and compilation is needed / requested, writes the stan model to the parent frame as ctsem.compiled (unless that object already exists and is not from ctsem), to avoid unnecessary recompilation. |
savescores |
Logical. If TRUE, output from the Kalman filter is saved in output. For datasets with many variables or time points, will increase file size substantially. |
savesubjectmatrices |
Logical. If TRUE, subject specific matrices are saved – only relevant when either time dependent predictors or individual differences are used. Can increase memory usage dramatically in large models, and can be computed after fitting using ctExtract or ctStanSubjectPars . |
saveComplexPars |
Logical. If TRUE, also save rowwise output of any complex parameters specified, i.e. combinations of parameters, functions and states. |
gendata |
Logical – If TRUE, uses provided data for only covariates and a time and missingness structure, and
generates random data according to the specified model / priors.
Generated data is in the $Ygen subobject after running |
control |
Used when |
verbose |
Integer from 0 to 2. Higher values print more information during model fit – for debugging. |
vb |
Logical. Use variational Bayes algorithm from stan? Only kind of working, not recommended. |
... |
additional arguments to pass to |
#generate a ctStanModel relying heavily on defaults
model<-ctModel(type='stanct',
latentNames=c('eta1','eta2'),
manifestNames=c('Y1','Y2'),
MANIFESTVAR=diag(.1,2),
TDpredNames='TD1',
TIpredNames=c('TI1','TI2','TI3'),
LAMBDA=diag(2))
fit<-ctStanFit(ctstantestdat, model,priors=TRUE)
summary(fit)
plot(fit,wait=FALSE)
#### extended examples
library(ctsem)
set.seed(3)
# Data generation (run this, but no need to understand!) -----------------
Tpoints <- 20
nmanifest <- 4
nlatent <- 2
nsubjects<-20
#random effects
age <- rnorm(nsubjects) #standardised
cint1<-rnorm(nsubjects,2,.3)+age*.5
cint2 <- cint1*.5+rnorm(nsubjects,1,.2)+age*.5
tdpredeffect <- rnorm(nsubjects,5,.3)+age*.5
for(i in 1:nsubjects){
#generating model
gm<-ctModel(Tpoints=Tpoints,n.manifest = nmanifest,n.latent = nlatent,n.TDpred = 1,
LAMBDA = matrix(c(1,0,0,0, 0,1,.8,1.3),nrow=nmanifest,ncol=nlatent),
DRIFT=matrix(c(-.3, .2, 0, -.5),nlatent,nlatent),
TDPREDMEANS=matrix(c(rep(0,Tpoints-10),1,rep(0,9)),ncol=1),
TDPREDEFFECT=matrix(c(tdpredeffect[i],0),nrow=nlatent),
DIFFUSION = matrix(c(1, 0, 0, .5),2,2),
CINT = matrix(c(cint1[i],cint2[i]),ncol=1),
T0VAR=diag(2,nlatent,nlatent),
MANIFESTVAR = diag(.5, nmanifest))
#generate data
newdat <- ctGenerate(ctmodelobj = gm,n.subjects = 1,burnin = 2,
dtmat<-rbind(c(rep(.5,8),3,rep(.5,Tpoints-9))))
newdat[,'id'] <- i #set id for each subject
newdat <- cbind(newdat,age[i]) #include time independent predictor
if(i ==1) {
dat <- newdat[1:(Tpoints-10),] #pre intervention data
dat2 <- newdat #including post intervention data
}
if(i > 1) {
dat <- rbind(dat, newdat[1:(Tpoints-10),])
dat2 <- rbind(dat2,newdat)
}
}
colnames(dat)[ncol(dat)] <- 'age'
colnames(dat2)[ncol(dat)] <- 'age'
#plot generated data for sanity
plot(age)
matplot(dat[,gm$manifestNames],type='l',pch=1)
plotvar <- 'Y1'
plot(dat[dat[,'id']==1,'time'],dat[dat[,'id']==1,plotvar],type='l',
ylim=range(dat[,plotvar],na.rm=TRUE))
for(i in 2:nsubjects){
points(dat[dat[,'id']==i,'time'],dat[dat[,'id']==i,plotvar],type='l',col=i)
}
dat2[,gm$manifestNames][sample(1:length(dat2[,gm$manifestNames]),size = 100)] <- NA
#data structure
head(dat2)
# Model fitting -----------------------------------------------------------
##simple univariate default model
m <- ctModel(type = 'stanct', manifestNames = c('Y1'), LAMBDA = diag(1))
ctModelLatex(m)
#Specify univariate linear growth curve
m1 <- ctModel(type = 'stanct',
manifestNames = c('Y1'), latentNames=c('eta1'),
DRIFT=matrix(-.0001,nrow=1,ncol=1),
DIFFUSION=matrix(0,nrow=1,ncol=1),
T0VAR=matrix(0,nrow=1,ncol=1),
CINT=matrix(c('cint1'),ncol=1),
T0MEANS=matrix(c('t0m1'),ncol=1),
LAMBDA = diag(1),
MANIFESTMEANS=matrix(0,ncol=1),
MANIFESTVAR=matrix(c('merror'),nrow=1,ncol=1))
ctModelLatex(m1)
#fit
f1 <- ctStanFit(datalong = dat2, ctstanmodel = m1, optimize=TRUE, priors=FALSE)
summary(f1)
#plots of individual subject models v data
ctKalman(f1,plot=TRUE,subjects=1,kalmanvec=c('y','yprior'),timestep=.01)
ctKalman(f1,plot=TRUE,subjects=1:3,kalmanvec=c('y','ysmooth'),timestep=.01,errorvec=NA)
ctStanPostPredict(f1, wait=FALSE) #compare randomly generated data from posterior to observed data
cf<-ctCheckFit(f1) #compare mean and covariance of randomly generated data to observed cov
plot(cf,wait=FALSE)
### Further example models
#Include intervention
m2 <- ctModel(type = 'stanct',
manifestNames = c('Y1'), latentNames=c('eta1'),
n.TDpred=1,TDpredNames = 'TD1', #this line includes the intervention
TDPREDEFFECT=matrix(c('tdpredeffect'),nrow=1,ncol=1), #intervention effect
DRIFT=matrix(-1e-5,nrow=1,ncol=1),
DIFFUSION=matrix(0,nrow=1,ncol=1),
CINT=matrix(c('cint1'),ncol=1),
T0MEANS=matrix(c('t0m1'),ncol=1),
T0VAR=matrix(0,nrow=1,ncol=1),
LAMBDA = diag(1),
MANIFESTMEANS=matrix(0,ncol=1),
MANIFESTVAR=matrix(c('merror'),nrow=1,ncol=1))
#Individual differences in intervention, Bayesian estimation, covariates
m2i <- ctModel(type = 'stanct',
manifestNames = c('Y1'), latentNames=c('eta1'),
TIpredNames = 'age',
TDpredNames = 'TD1', #this line includes the intervention
TDPREDEFFECT=matrix(c('tdpredeffect||TRUE'),nrow=1,ncol=1), #intervention effect
DRIFT=matrix(-1e-5,nrow=1,ncol=1),
DIFFUSION=matrix(0,nrow=1,ncol=1),
CINT=matrix(c('cint1'),ncol=1),
T0MEANS=matrix(c('t0m1'),ncol=1),
T0VAR=matrix(0,nrow=1,ncol=1),
LAMBDA = diag(1),
MANIFESTMEANS=matrix(0,ncol=1),
MANIFESTVAR=matrix(c('merror'),nrow=1,ncol=1))
#Including covariate effects
m2ic <- ctModel(type = 'stanct',
manifestNames = c('Y1'), latentNames=c('eta1'),
n.TIpred = 1, TIpredNames = 'age',
n.TDpred=1,TDpredNames = 'TD1', #this line includes the intervention
TDPREDEFFECT=matrix(c('tdpredeffect'),nrow=1,ncol=1), #intervention effect
DRIFT=matrix(-1e-5,nrow=1,ncol=1),
DIFFUSION=matrix(0,nrow=1,ncol=1),
CINT=matrix(c('cint1'),ncol=1),
T0MEANS=matrix(c('t0m1'),ncol=1),
T0VAR=matrix(0,nrow=1,ncol=1),
LAMBDA = diag(1),
MANIFESTMEANS=matrix(0,ncol=1),
MANIFESTVAR=matrix(c('merror'),nrow=1,ncol=1))
m2ic$pars$indvarying[m2ic$pars$matrix %in% 'TDPREDEFFECT'] <- TRUE
#Include deterministic dynamics
m3 <- ctModel(type = 'stanct',
manifestNames = c('Y1'), latentNames=c('eta1'),
n.TDpred=1,TDpredNames = 'TD1', #this line includes the intervention
TDPREDEFFECT=matrix(c('tdpredeffect'),nrow=1,ncol=1), #intervention effect
DRIFT=matrix('drift11',nrow=1,ncol=1),
DIFFUSION=matrix(0,nrow=1,ncol=1),
CINT=matrix(c('cint1'),ncol=1),
T0MEANS=matrix(c('t0m1'),ncol=1),
T0VAR=matrix('t0var11',nrow=1,ncol=1),
LAMBDA = diag(1),
MANIFESTMEANS=matrix(0,ncol=1),
MANIFESTVAR=matrix(c('merror1'),nrow=1,ncol=1))
#Add system noise to allow for fluctuations that persist in time
m3n <- ctModel(type = 'stanct',
manifestNames = c('Y1'), latentNames=c('eta1'),
n.TDpred=1,TDpredNames = 'TD1', #this line includes the intervention
TDPREDEFFECT=matrix(c('tdpredeffect'),nrow=1,ncol=1), #intervention effect
DRIFT=matrix('drift11',nrow=1,ncol=1),
DIFFUSION=matrix('diffusion',nrow=1,ncol=1),
CINT=matrix(c('cint1'),ncol=1),
T0MEANS=matrix(c('t0m1'),ncol=1),
T0VAR=matrix('t0var11',nrow=1,ncol=1),
LAMBDA = diag(1),
MANIFESTMEANS=matrix(0,ncol=1),
MANIFESTVAR=matrix(c(0),nrow=1,ncol=1))
#include 2nd latent process
m4 <- ctModel(n.manifest = 2,n.latent = 2, type = 'stanct',
manifestNames = c('Y1','Y2'), latentNames=c('L1','L2'),
n.TDpred=1,TDpredNames = 'TD1',
TDPREDEFFECT=matrix(c('tdpredeffect1','tdpredeffect2'),nrow=2,ncol=1),
DRIFT=matrix(c('drift11','drift21','drift12','drift22'),nrow=2,ncol=2),
DIFFUSION=matrix(c('diffusion11','diffusion21',0,'diffusion22'),nrow=2,ncol=2),
CINT=matrix(c('cint1','cint2'),nrow=2,ncol=1),
T0MEANS=matrix(c('t0m1','t0m2'),nrow=2,ncol=1),
T0VAR=matrix(c('t0var11','t0var21',0,'t0var22'),nrow=2,ncol=2),
LAMBDA = matrix(c(1,0,0,1),nrow=2,ncol=2),
MANIFESTMEANS=matrix(c(0,0),nrow=2,ncol=1),
MANIFESTVAR=matrix(c('merror1',0,0,'merror2'),nrow=2,ncol=2))
#dynamic factor model -- fixing CINT to 0 and freeing indicator level intercepts
m3df <- ctModel(type = 'stanct',
manifestNames = c('Y2','Y3'), latentNames=c('eta1'),
n.TDpred=1,TDpredNames = 'TD1', #this line includes the intervention
TDPREDEFFECT=matrix(c('tdpredeffect'),nrow=1,ncol=1), #intervention effect
DRIFT=matrix('drift11',nrow=1,ncol=1),
DIFFUSION=matrix('diffusion',nrow=1,ncol=1),
CINT=matrix(c(0),ncol=1),
T0MEANS=matrix(c('t0m1'),ncol=1),
T0VAR=matrix('t0var11',nrow=1,ncol=1),
LAMBDA = matrix(c(1,'Y3loading'),nrow=2,ncol=1),
MANIFESTMEANS=matrix(c('Y2_int','Y3_int'),nrow=2,ncol=1),
MANIFESTVAR=matrix(c('Y2residual',0,0,'Y3residual'),nrow=2,ncol=2))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.