TEfit | R Documentation |
This is the primary function for the TEfits package (but see TEbrm for a more powerful approach). Fits a time-evolving regression model. Many options are available for various error functions, functional forms of change, nested timescales, bootstrapping/subsampling/cross-validation, and so on.
Various handy S3 methods are available, such as plot, summary, coef, and simulate.
TEfit(
varIn,
linkFun = list(link = "identity"),
errFun = "ols",
changeFun = "expo",
bootPars = tef_bootList(),
blockTimeVar = NULL,
covarTerms = list(),
control = tef_control()
)
varIn |
Data frame or vector. First column [or vector] must be the time-dependent response variable (left hand side of regression). If available, second column must be the time variable. All other columns are covariates, possibly involved in a link function. |
linkFun |
A list defining a link function (i.e., 'identity', 'd_prime', 'weibull', or 'logit'). |
errFun |
A string defining an error function (e.g., 'ols', 'logcosh', 'bernoulli'). |
changeFun |
A string defining the functional form of change (e.g., 'expo', 'power', 'weibull') |
bootPars |
A list defining the details for bootstrapped fits. Defaults to no bootstrapping. Necessary for estimates of uncertainty around fits and for covariance between parameters. |
blockTimeVar |
A string naming the covariate that holds the time points of sub-scales (e.g., "blocks" of times within the overall timescale of data collection). |
covarTerms |
An optional list of logical vectors indicating whether parameters should vary by covariates. See examples. |
control |
A list of model parameters. Use of tef_control() is highly recommended. |
TEfit defines a nonlinear regression model and re-fits that model using optim numerous times, with parameter values randomly initialized prior to optimization, until the highest-likelihood fitting runs also have parameters very similar to one another (i.e., SD less than the convergence criterion). Runs are implemented in batches of 10. Convergence is a heuristic and should ideally be corroborated with other measures (e.g., bootstrapping).
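The fitting loop described above can be illustrated with a minimal base-R sketch. This is not the TEfit source code: the simulated data, the `sse` objective, the starting-value ranges, the choice to compare the 5 best runs, and the names `convergeTol` and `maxBatches` are all assumptions made for illustration.

```r
# Hypothetical sketch of multi-start optimization with a batch-wise
# convergence heuristic; NOT the internals of TEfit.
set.seed(1)
timeVar <- 1:50
respVar <- 0.9 + (0.3 - 0.9) * 2^(-timeVar / 2^3) + rnorm(50, sd = 0.02)

# Sum of squared error for a 3-parameter exponential:
# p[1] = start, p[2] = log2 of time to 50 percent remaining, p[3] = asymptote
sse <- function(p) {
  yHat <- p[3] + (p[1] - p[3]) * 2^(-timeVar / 2^p[2])
  sum((respVar - yHat)^2)
}

fits <- NULL
converged <- FALSE
convergeTol <- 0.05 # illustrative tolerance on the SD of the best parameters
maxBatches <- 20    # illustrative cap on the number of batches

for (batch in seq_len(maxBatches)) {
  # One batch = 10 optim runs from random starting values
  for (i in 1:10) {
    init <- c(runif(1, 0, 1), runif(1, 0, 6), runif(1, 0, 1))
    fit <- optim(init, sse)
    fits <- rbind(fits, c(fit$par, err = fit$value))
  }
  # Compare the parameters of the 5 lowest-error runs so far
  best <- fits[order(fits[, "err"]), , drop = FALSE][1:5, 1:3]
  if (all(apply(best, 2, sd) < convergeTol)) {
    converged <- TRUE
    break
  }
}
```

After the loop, `converged` records whether the best runs agreed, and `fits` holds one row per optimization run. Agreement among the best runs does not prove a global optimum, which is why corroboration by bootstrapping is suggested.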
Bootstrapping or subsampling is specified as bootPars=tef_bootList(resamples = 0, bootPercent = 1, bootTries = 20). resamples is the number of times the model is re-fit on resampled data, bootPercent is the proportion (between 0 and 1) of the data resampled, and bootTries is the number of optimization runs attempted on each resample. A bootPercent of 1, the default, implements resampling with replacement (bootstrapping). A bootPercent less than 1 implements resampling without replacement: the model is fit to that subsample, and the fit values are then evaluated on the left-out data (i.e., cross-validation). bootTries defaults to a very small number (20).
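The difference between the two resampling modes can be sketched in base R. The helper `resample_rows` below is hypothetical and is not part of TEfits; it only illustrates how row indices could be drawn under each setting of bootPercent.

```r
# Hypothetical helper (not a TEfits function): choose rows for one resample.
resample_rows <- function(n, bootPercent = 1) {
  if (bootPercent == 1) {
    # Bootstrapping: n rows drawn with replacement, nothing held out
    fitRows <- sample(n, n, replace = TRUE)
    testRows <- integer(0)
  } else {
    # Subsampling: a proportion drawn without replacement, the rest held out
    fitRows <- sample(n, round(n * bootPercent), replace = FALSE)
    testRows <- setdiff(seq_len(n), fitRows)
  }
  list(fitRows = fitRows, testRows = testRows)
}

set.seed(1)
boot <- resample_rows(50)       # bootstrap resample
cv <- resample_rows(50, 0.8)    # 80/20 split for cross-validation
```

With bootPercent below 1, the held-out rows in `testRows` are the ones on which out-of-sample error would be evaluated.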
Currently supported error functions are:
ols, i.e. sum((y-yHat)^2) – sum of squared error
rmse, i.e. sqrt(mean((y-yHat)^2)) – root mean squared error
logcosh, i.e. sum(log(cosh(y-yHat))) – log-hyperbolic-cosine
bernoulli, i.e. -sum(y*log(yHat) + (1-y)*log(1-yHat)) – Bernoulli [binary binomial]
exGauss_mu, i.e. -sum(log(retimes::dexgauss(y,mu=yHat,sigma=sigma_param,tau=tau_param))) – ex-Gaussian distribution with time-evolving change in the Gaussian mean parameter
exGauss_tau, i.e. -sum(log(retimes::dexgauss(y,mu=mu_param,sigma=sigma_param,tau=yHat))) – ex-Gaussian distribution with time-evolving change in the tau parameter
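The first four error functions can be written directly from the formulas above as plain R functions. The function names and the toy vectors `y` and `yHat` are illustrative only; the ex-Gaussian variants are omitted because they depend on the retimes package.

```r
# Direct transcriptions of the listed formulas (illustrative names).
err_ols <- function(y, yHat) sum((y - yHat)^2)
err_rmse <- function(y, yHat) sqrt(mean((y - yHat)^2))
err_logcosh <- function(y, yHat) sum(log(cosh(y - yHat)))
err_bernoulli <- function(y, yHat) {
  -sum(y * log(yHat) + (1 - y) * log(1 - yHat))
}

# Toy binary outcomes and predicted probabilities
y <- c(0, 1, 1, 0)
yHat <- c(0.2, 0.7, 0.9, 0.4)

errs <- c(
  ols = err_ols(y, yHat),
  rmse = err_rmse(y, yHat),
  logcosh = err_logcosh(y, yHat),
  bernoulli = err_bernoulli(y, yHat)
)
```

Note that the Bernoulli error is undefined when a prediction is exactly 0 or 1, which is the kind of pathological case the lapseRate offset in the link functions is meant to prevent.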
Currently supported link functions are:
identity – implemented by linkFun=list(link='identity') – Default. The predicted values of the time-evolving function are the predicted values of the model.
logit – implemented by linkFun=list(link='logit',logistX='variableName',threshChange=T,biasChange=F,fitThresh=.75,lapseRate=.005) – The predicted values of the time-evolving function are threshold values of logistX (by default) and/or the bias value (defaults to constant). link and logistX are required. Other parameters have default values (i.e., modelled value of the outcome variable [fitThresh], offset of the predicted values to prevent a pathological error calculation [lapseRate]).
weibull – implemented by linkFun=list(link='weibull',weibullX='variableName',fitThresh=.75,yIntercept=.5,rhAsymptote=1,lapseRate=.005) – The predicted values of the time-evolving function are threshold values of weibullX. link and weibullX are required. Other parameters have default values (i.e., modelled value of the outcome variable [fitThresh], value of the outcome variable at weibullX==0 [yIntercept], value of the outcome variable at weibullX==Inf [rhAsymptote], offset of the predicted values to prevent a pathological error calculation [lapseRate]).
d_prime – implemented by linkFun=list(link='d_prime',presence='variableName',max_d_prime=5,smooth_hwhm=3) – link and presence are required; max_d_prime and smooth_hwhm have default values. The pFA and pH are first calculated using a windowed average of stimulus-present or stimulus-absent trials (penalized to bound the max d-prime). The by-timepoint d-prime is then calculated and fit as the response variable using an identity link. See tef_acc2dprime and tef_runningMean for details about the intermediate steps.
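For orientation, the standard signal-detection definition of d-prime is the difference between the normal quantiles of the hit rate and the false-alarm rate. The sketch below is not tef_acc2dprime: the function `dprime_bounded` is hypothetical, and it bounds d-prime by simple clamping of the rates rather than by the penalization that TEfits applies.

```r
# Hypothetical helper (not tef_acc2dprime): d-prime from hit and false-alarm
# rates, with rates clamped so that |d-prime| cannot exceed max_d_prime.
dprime_bounded <- function(pH, pFA, max_d_prime = 5) {
  lo <- pnorm(-max_d_prime / 2)
  hi <- pnorm(max_d_prime / 2)
  pH <- pmin(pmax(pH, lo), hi)
  pFA <- pmin(pmax(pFA, lo), hi)
  qnorm(pH) - qnorm(pFA)
}

d_typical <- dprime_bounded(pH = 0.8, pFA = 0.2) # moderate sensitivity
d_ceiling <- dprime_bounded(pH = 1, pFA = 0)     # perfect performance, bounded
```

Without some bound, perfect hit or false-alarm rates would give an infinite d-prime, which is why a maximum is imposed.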
Currently supported change functions are:
expo – 3-parameter exponential (start, [inverse] rate, and asymptote) – rate is log of time to some proportion remaining, default is log2 of time to 50 percent remaining
expo_block – 3-parameter exponential (start, [inverse] rate, and asymptote) plus 2-parameter multiplicative changes on timescales that are a subset of the whole
expo_double – 4-parameter exponential (start, two equally weighted [inverse] rates, and asymptote)
power – 3-parameter power (start, [inverse] rate, and asymptote) – rate is log of time to some proportion remaining, defaulting to log2 of time to 25 percent remaining
power4 – 4-parameter power (start, [inverse] rate, asymptote, and "previous learning time")
weibull – 4-parameter weibull (start, [inverse] rate, asymptote, and shape) – rate is same as expo
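The meaning of the expo rate parameter can be made concrete with a small base-R function. This is a sketch under stated assumptions: the exact parameterization used inside TEfits (in particular the time origin) is not given here, so `expo_change` below simply measures time from 0 and is not the package's own function.

```r
# Hypothetical 3-parameter exponential change function. pRate is the log2 of
# the time at which 50 percent of the start-to-asymptote difference remains.
# Assumes time is measured from 0; TEfits may offset time differently.
expo_change <- function(timeVar, pStart, pRate, pAsym) {
  pAsym + (pStart - pAsym) * 2^(-timeVar / 2^pRate)
}

# With pRate = 3, half of the change remains at time 2^3 = 8
y0 <- expo_change(0, pStart = 0.3, pRate = 3, pAsym = 0.9)
y8 <- expo_change(8, pStart = 0.3, pRate = 3, pAsym = 0.9)
y16 <- expo_change(16, pStart = 0.3, pRate = 3, pAsym = 0.9)
```

Because the rate is on a log scale, adding 1 to pRate doubles the time needed to reach any given proportion of the total change.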
A TEfit S3 object including:
model
The best fit of parameters to the dataset, along with associated items such as tests of goodness-of-fit and nonstationarity
nullFit
The best fit of a model that does not change over time, but otherwise uses the same parameterization as model
data
Data frame of the input variables fit by the model, with the addition of fit values
modList
List of model details
bootList
(if relevant) List of bootstrapped TEfit models.
By default, the mean of the time-evolving model's fit values should be very similar to the mean of the null fit values.
This is enforced by penalizing the time-evolving model's error multiplicatively by 1 + the square of the difference
between the average of the model prediction and the average of the null [non-time-evolving] prediction. This is intended to
constrain model predictions to a "sane" range. This constraint can be removed with control=tef_control(penalizeMean=F).
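The mean-matching penalty described above amounts to a single multiplication. The function name `penalize_mean` and the toy values are hypothetical; the formula follows the description (error multiplied by 1 plus the squared difference of the two prediction means).

```r
# Hypothetical helper: multiply the error by 1 + the squared difference
# between the mean model prediction and the mean null prediction.
penalize_mean <- function(err, yHat, nullYhat) {
  err * (1 + (mean(yHat) - mean(nullYhat))^2)
}

yHat <- c(0.6, 0.7, 0.8)      # time-evolving predictions, mean 0.7
nullYhat <- rep(0.5, 3)       # null (constant) predictions, mean 0.5
penalized <- penalize_mean(err = 2, yHat = yHat, nullYhat = nullYhat)
unpenalized <- penalize_mean(err = 2, yHat = yHat, nullYhat = rep(0.7, 3))
```

When the two means agree, the multiplier is exactly 1 and the error is unchanged.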
Currently known bugs are:
The logit link function must include threshChange=T. An error is returned if only the bias term is allowed to change.
For interpreting model outputs: plot.TEfit; summary.TEfit; coef.TEfit; simulate.TEfit
TEfitAll for fitting a set of TEfit models.
TEbrm for fitting a Bayesian regression model, with many options including fixed and random effects.
For including a nonlinear time predictor in [generalized] linear regression frameworks: TElm; TEglm; TElmem; TEglmem
## Not run:
## example data:
dat <- data.frame(
  timeVar = 1:50,
  respVar = c(seq(.3, .9, length.out = 25), seq(.9, .91, length.out = 25)) + rep(c(0, .01), 25),
  covar1 = rep(c(1, 2), 25),
  covar2 = rep(c(-3, -1, 0, 2, 5), 10)
)
## Default fitting of 'ols' error function and 3-parameter exponential change function:
m <- TEfit(dat[,c('respVar','timeVar')])
summary(m)
# view a plot of the model:
plot(m)
## 'bernoulli' error function:
m <- TEfit(dat[,c('respVar','timeVar')],errFun='bernoulli')
summary(m)
## 3-parameter power change function:
m <- TEfit(dat[,c('respVar','timeVar')],changeFun='power')
summary(m)
# view a plot of the model:
plot(m)
## logistic threshold change:
m <- TEfit(dat[,c('respVar','timeVar','covar2')],errFun='bernoulli',linkFun=list(link='logit',logistX='covar2'))
## include 2 covariates (on all 3 parameters by default):
m <- TEfit(dat[,c('respVar','timeVar','covar1','covar2')])
summary(m) # (likely does not converge due to too many [nonsense] covariates)
plot(m)
## include 2 covariates:
## asymptote and rate are affected by covar1, start and rate are affected by covar2
m <- TEfit(dat[,c('respVar','timeVar','covar1','covar2')],covarTerms=list(pStart=c(F,T),pRate=c(T,T),pAsym=c(T,F)))
## 50 bootstrapped fits:
m <- TEfit(dat[,c('respVar','timeVar')],bootPars=tef_bootList(resamples=50))
summary(m)
# view a plot of the model, with CI bands:
plot(m)
# view the predicted values of the model by plotting data simulated from the parameters:
. <- simulate(m,toPlot=T)
## 50 random-subsample 80/20 cross-validation fits:
m <- TEfit(dat[,c('respVar','timeVar')],bootPars=tef_bootList(resamples=50,bootPercent=.8))
summary(m)
## ## ## control parameters:
## Increase convergence tolerance to 0.1:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(convergeTol=.1))
## Increase the maximum run number to 5000 (defaults to 200):
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(nTries=5000))
## If the function will asymptote in the given time period, then one option is
## to calculate the TE function stepwise: first get a stable fit of the last 20%
## of timepoints (if there are enough timepoints, average this with a stable fit
## to the last 10% of timepoints). Then fit the start and rate of approach to
## this asymptote:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(stepwise_asym = T))
## Put limits on the predicted values:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(y_lim=c(.1,.9)))
## Put limits on the rate parameter:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(rate_lim=c(2,4)))
## Remove the constraint that the time-evolving fit values should have the same mean as the null fit values:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(penalizeMean=F))
## If rate parameter is hitting the boundary, try imposing a slight penalization for extreme rate values:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(penalizeRate=T))
## Change the exponential change log base from 2 to e:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(expBase=exp(1)))
## Change the rate parameter log base from 2 to e:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(rateBase=exp(1)))
## Silence errors:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(quietErrs=T))
## Fix a parameter [asymptote] to 0.8:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(pFix=list(pAsym=.8)))
## End(Not run)