View source: R/TEfit.R

TEfit R Documentation

Fit a time-evolving model (nonlinear regression by minimizing error)

Description

This is the primary function for the TEfits package (but see TEbrm for a more powerful approach). Fits a time-evolving regression model. Many options are available for various error functions, functional forms of change, nested timescales, bootstrapping/subsampling/cross-validation, and so on. Various handy S3 methods are available, such as plot, summary, coef, and simulate.

Usage

TEfit(
  varIn,
  linkFun = list(link = "identity"),
  errFun = "ols",
  changeFun = "expo",
  bootPars = tef_bootList(),
  blockTimeVar = NULL,
  covarTerms = list(),
  control = tef_control()
)

Arguments

varIn

Data frame or vector. First column [or vector] must be the time-dependent response variable (left hand side of regression). If available, second column must be the time variable. All other columns are covariates, possibly involved in a link function.

linkFun

A list defining a link function (i.e., 'identity', 'd_prime', 'weibull', or 'logit').

errFun

A string defining an error function (e.g., 'ols', 'logcosh', 'bernoulli').

changeFun

A string defining the functional form of change (e.g., 'expo', 'power', 'weibull').

bootPars

A list defining the details for bootstrapped fits. Defaults to no bootstrapping. Necessary for estimates of uncertainty around fits and for covariance between parameters.

blockTimeVar

A string identifying which covariate contains the time points of sub-scales (e.g., "blocks" of times within the overall timescale of data collection).

covarTerms

An optional list of logical vectors indicating whether parameters should vary by covariates. See examples.

control

A list of model parameters. Use of tef_control() is highly recommended.

Details

TEfit defines a nonlinear regression model and re-fits that model using optim numerous times, with parameter values randomly initialized prior to optimization, until the highest-likelihood fitting runs also have parameters very similar to one another (i.e., SD less than the convergence criterion). Runs are implemented in batches of 10. Convergence is a heuristic and should ideally be corroborated with other measures (e.g., bootstrapping).
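
The batch-and-check loop described above can be sketched in base R. This is an illustration only, not the package's internal code: the helper name fit_until_stable, the arguments nBest, batchSize and maxRuns, the range of the random starting values, and the exponential form inside sse are all assumptions made for the example.

```r
# Hypothetical sketch of "re-fit from random starts until the best runs agree".
set.seed(1)
timeVar <- 1:50
y <- 0.9 + (0.3 - 0.9) * 2^(-timeVar / 2^3) + rnorm(50, sd = 0.01)

# Sum of squared error for an assumed 3-parameter exponential:
# p[1] = start, p[2] = log2 rate, p[3] = asymptote
sse <- function(p) {
  yHat <- p[3] + (p[1] - p[3]) * 2^(-timeVar / 2^p[2])
  sum((y - yHat)^2)
}

fit_until_stable <- function(errFn, nPar, batchSize = 10, maxRuns = 200,
                             convergeTol = 0.05, nBest = 5) {
  runs <- NULL
  converged <- FALSE
  while (!converged && (is.null(runs) || nrow(runs) < maxRuns)) {
    # One batch of optim() runs, each from random starting values
    batch <- t(replicate(batchSize, {
      o <- optim(runif(nPar, 0, 5), errFn)
      c(o$par, o$value)
    }))
    runs <- rbind(runs, batch)
    # Compare the parameters of the lowest-error runs so far
    ord <- order(runs[, nPar + 1])
    best <- runs[ord[seq_len(nBest)], seq_len(nPar), drop = FALSE]
    converged <- all(apply(best, 2, sd) < convergeTol)
  }
  list(par = runs[ord[1], seq_len(nPar)], err = runs[ord[1], nPar + 1],
       nRuns = nrow(runs), converged = converged)
}

fit <- fit_until_stable(sse, nPar = 3)
```

Here fit$converged reports whether the SD criterion was met before the run limit, which is why the heuristic is best checked against bootstrapped fits as well.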

Bootstrapping or subsampling is specified as bootPars=tef_bootList(resamples = 0, bootPercent = 1, bootTries = 20). resamples is the number of times the model is re-fit on resampled data, bootPercent is the proportion (between 0 and 1) of the data resampled, and bootTries is the number of optimization runs attempted on each subsample. A bootPercent of 1, the default, implements resampling with replacement (bootstrapping). A bootPercent less than 1 implements resampling without replacement: the model is fit to that subsample and the fit values are evaluated on the left-out subsample (i.e., cross-validation). bootTries defaults to a very small number (20).
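
The difference between the two resampling modes can be sketched in base R as follows. The helper resample_rows is hypothetical and only mirrors the description above; it is not how tef_bootList is implemented, and returning an empty held-out set for the bootstrap case is an assumption of the sketch.

```r
# Hypothetical sketch: choose rows for one resample, given bootPercent.
set.seed(2)
dat <- data.frame(timeVar = 1:50, respVar = seq(0.3, 0.9, length.out = 50))

resample_rows <- function(n, bootPercent = 1) {
  if (bootPercent >= 1) {
    # Bootstrapping: n rows drawn with replacement, nothing held out here
    train <- sample.int(n, n, replace = TRUE)
    test <- integer(0)
  } else {
    # Subsampling: a proportion of rows drawn without replacement;
    # the remaining rows are held out for cross-validation
    train <- sample.int(n, round(n * bootPercent))
    test <- setdiff(seq_len(n), train)
  }
  list(train = train, test = test)
}

boot <- resample_rows(nrow(dat))                   # bootstrap resample
cv <- resample_rows(nrow(dat), bootPercent = 0.8)  # 80/20 split
```

A model would then be fit to dat[cv$train, ] and its predictions evaluated on dat[cv$test, ].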

Currently supported error functions are:

  • ols, i.e. sum((y-yHat)^2) – sum of squared error

  • rmse, i.e. sqrt(mean((y-yHat)^2)) – root mean squared error

  • logcosh, i.e. sum(log(cosh(y-yHat))) – log-hyperbolic-cosine

  • bernoulli, i.e. -sum(y*log(yHat) + (1-y)*log(1-yHat)) – Bernoulli [binary binomial]

  • exGauss_mu, i.e. -sum(log(retimes::dexgauss(y,mu=yHat,sigma=sigma_param,tau=tau_param))) – ex-Gaussian distribution with time-evolving change in the Gaussian mean parameter

  • exGauss_tau, i.e. -sum(log(retimes::dexgauss(y,mu=mu_param,sigma=sigma_param,tau=yHat))) – ex-Gaussian distribution with time-evolving change in the tau parameter
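
For reference, the first four error functions can be written directly from the formulas listed above. The function names err_ols, err_rmse, err_logcosh and err_bernoulli are illustrative and are not exported by the package; the ex-Gaussian variants are omitted because they depend on the retimes package.

```r
# Illustrative definitions taken from the formulas in the list above.
err_ols       <- function(y, yHat) sum((y - yHat)^2)
err_rmse      <- function(y, yHat) sqrt(mean((y - yHat)^2))
err_logcosh   <- function(y, yHat) sum(log(cosh(y - yHat)))
err_bernoulli <- function(y, yHat) -sum(y * log(yHat) + (1 - y) * log(1 - yHat))

# Small made-up example: binary outcomes and predicted probabilities
y    <- c(0, 1, 1, 0)
yHat <- c(0.2, 0.7, 0.9, 0.4)

err_ols(y, yHat)
err_rmse(y, yHat)
err_logcosh(y, yHat)
err_bernoulli(y, yHat)
```

Note that the Bernoulli error is only finite when yHat lies strictly between 0 and 1.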

Currently supported link functions are:

  • identity – implemented by linkFun=list(link='identity') – Default. The predicted values of the time-evolving function are the predicted values of the model.

  • logit – implemented by linkFun=list(link='logit',logistX='variableName', threshChange=T,biasChange=F,fitThresh=.75,lapseRate=.005) – The predicted values of the time-evolving function are threshold values of logistX (by default) and/or the bias value (defaults to constant). link and logistX are required. Other parameters have default values (i.e., modelled value of the outcome variable [fitThresh], offset of the predicted values to prevent a pathological error calculation [lapseRate])

  • weibull – implemented by linkFun=list(link='weibull',weibullX='variableName', fitThresh=.75,yIntercept=.5,rhAsymptote=1,lapseRate=.005) – The predicted values of the time-evolving function are threshold values of weibullX. link and weibullX are required. Other parameters have default values (i.e., modelled value of the outcome variable [fitThresh], value of the outcome variable at weibullX==0 [yIntercept], value of the outcome variable at weibullX==Inf [rhAsymptote], offset of the predicted values to prevent a pathological error calculation [lapseRate]).

  • d_prime – implemented by linkFun=list(link='d_prime',presence='variableName', max_d_prime=5,smooth_hwhm=3) – link and presence are required; max_d_prime and smooth_hwhm have default values. The pFA and pH are first calculated using a windowed average of stimulus-present or stimulus-absent trials (penalized to bound the max d-prime). The by-timepoint d-prime is then calculated and fit as the response variable using an identity link. See tef_acc2dprime and tef_runningMean for details about the intermediate steps.
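
As a reminder of the quantity the d_prime link works with, the standard signal-detection definition of d-prime can be sketched as below. The function dprime is illustrative only: it does not reproduce the windowed averaging or the max_d_prime bound, which are handled by tef_acc2dprime and tef_runningMean in ways not shown here.

```r
# Standard signal-detection d-prime from a hit rate (pH) and a
# false-alarm rate (pFA); no smoothing or bounding is applied.
dprime <- function(pH, pFA) qnorm(pH) - qnorm(pFA)

dprime(0.5, 0.5)    # equal hit and false-alarm rates: no sensitivity
dprime(0.84, 0.16)  # well-separated rates: sensitivity near 2
```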

Currently supported change functions are:

  • expo – 3-parameter exponential (start, [inverse] rate, and asymptote) – rate is log of time to some proportion remaining, default is log2 of time to 50 percent remaining

  • expo_block – 3-parameter exponential (start, [inverse] rate, and asymptote) plus 2-parameter multiplicative changes on timescales that are a subset of the whole

  • expo_double – 4-parameter exponential (start, two equally weighted [inverse] rates, and asymptote)

  • power – 3-parameter power (start, [inverse] rate, and asymptote) – rate is log of time to some proportion remaining, defaulting to log2 of time to 25 percent remaining

  • power4 – 4-parameter power (start, [inverse] rate, asymptote, and "previous learning time")

  • weibull – 4-parameter weibull (start, [inverse] rate, asymptote, and shape) – rate is same as expo
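
To make the rate parameterization concrete, here is one way to write a 3-parameter exponential in which the rate is the log2 of the time to 50 percent remaining. The function expo_change, the way expBase and rateBase enter the formula, and the choice of time 0 as the starting point are assumptions for illustration; the package's exact expression may differ.

```r
# Hypothetical 3-parameter exponential change function.
# With the defaults, half of the start-to-asymptote distance remains
# at timeVar == 2^pRate.
expo_change <- function(timeVar, pStart, pRate, pAsym,
                        expBase = 2, rateBase = 2) {
  pAsym + (pStart - pAsym) * expBase^(-timeVar / rateBase^pRate)
}

# Start at 0.3, approach 0.9, with log2 rate 3 (time constant of 8)
expo_change(c(0, 8, 16), pStart = 0.3, pRate = 3, pAsym = 0.9)
```

Under these assumptions the three values are the start, the halfway point, and the three-quarter point of the change.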

Value

A TEfit S3 object including:

model

The best fit of parameters to the dataset, along with associated items such as tests of goodness-of-fit and nonstationarity

nullFit

The best fit of a model that does not change over time, but otherwise uses the same parameterization as model

data

Data frame of the input variables fit by the model, with the addition of fit values

modList

List of model details

bootList

(if relevant) List of bootstrapped TEfit models.

Note

By default, the mean of the time-evolving model's fit values should be very similar to the mean of the null fit values. This is enforced by penalizing the time-evolving model's error multiplicatively by 1 + the square of the difference between the average of the model prediction and the average of the null [non-time-evolving] prediction. This is intended to constrain model predictions to a "sane" range. This constraint can be removed with control=tef_control(penalizeMean=F).
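
The penalty described above amounts to a single multiplication. In the sketch below, penalized_err is a hypothetical helper written from that description and is not part of the package.

```r
# Multiply the raw error by 1 + (difference in mean predictions)^2.
penalized_err <- function(err, yHat, yHatNull) {
  err * (1 + (mean(yHat) - mean(yHatNull))^2)
}

penalized_err(2, yHat = c(1, 1), yHatNull = c(1, 1))  # equal means: error unchanged
penalized_err(2, yHat = c(2, 2), yHatNull = c(1, 1))  # means differ by 1: error doubled
```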

Currently known bugs are:

  • The logit link function (linkFun) must include threshChange=T. An error is returned if only the bias term is allowed to change.

See Also

For interpreting model outputs: plot.TEfit; summary.TEfit; coef.TEfit; simulate.TEfit

TEfitAll for fitting a set of TEfit models.

TEbrm for fitting a Bayesian regression model, with many options including fixed and random effects.

For including a nonlinear time predictor in [generalized] linear regression frameworks: TElm; TEglm; TElmem; TEglmem

Examples

## Not run: 
## example data:
dat <- data.frame(
  timeVar = 1:50,
  respVar = c(seq(.3, .9, length.out = 25), seq(.9, .91, length.out = 25)) + rep(c(0, .01), 25),
  covar1 = rep(c(1, 2), 25),
  covar2 = rep(c(-3, -1, 0, 2, 5), 10)
)

## Default fitting of 'ols' error function and 3-parameter exponential change function:
m <- TEfit(dat[,c('respVar','timeVar')])
summary(m)
# view a plot of the model:
plot(m)

## 'bernoulli' error function:
m <- TEfit(dat[,c('respVar','timeVar')],errFun='bernoulli')
summary(m)

## 3-parameter power change function:
m <- TEfit(dat[,c('respVar','timeVar')],changeFun='power')
summary(m)
# view a plot of the model:
plot(m)

## logistic threshold change:
m <- TEfit(dat[,c('respVar','timeVar','covar2')],errFun='bernoulli',linkFun=list(link='logit',logistX='covar2'))

## include 2 covariates (on all 3 parameters by default):
m <- TEfit(dat[,c('respVar','timeVar','covar1','covar2')])
summary(m) # (likely does not converge due to too many [nonsense] covariates)
plot(m)

## include 2 covariates:
## asymptote and rate are affected by covar1, start and rate are affected by covar2
m <- TEfit(dat[,c('respVar','timeVar','covar1','covar2')],covarTerms=list(pStart=c(F,T),pRate=c(T,T),pAsym=c(T,F)))

## 50 bootstrapped fits:
m <- TEfit(dat[,c('respVar','timeVar')],bootPars=tef_bootList(resamples=50))
summary(m)
# view a plot of the model, with CI bands:
plot(m)
# view the predicted values of the model by plotting data simulated from the parameters:
. <- simulate(m,toPlot=T)

## 50 random-subsample 80/20 cross-validation fits:
m <- TEfit(dat[,c('respVar','timeVar')],bootPars=tef_bootList(resamples=50,bootPercent=.8))
summary(m)

## ## ## control parameters:

## Increase convergence tolerance to 0.1:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(convergeTol=.1))

## Increase the maximum run number to 5000 (defaults to 200):
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(nTries=5000))

## If the function will asymptote in the given time period, then one option is
## to calculate the TE function stepwise: first get a stable fit of last 20% of
## timepoints (if there are enough timepoints, average this with a stable fit to
## the last 10% of timepoints). Then fit the start and rate of approach to this
## asymptote:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(stepwise_asym = T))

## Put limits on the predicted values:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(y_lim=c(.1,.9)))

## Put limits on the rate parameter:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(rate_lim=c(2,4)))

## Remove the constraint that the time-evolving fit values should have the same mean as the null fit values:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(penalizeMean=F))

## If rate parameter is hitting the boundary, try imposing a slight penalization for extreme rate values:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(penalizeRate=T))

## Change the exponential change log base from 2 to e:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(expBase=exp(1)))

## Change the rate parameter log base from 2 to e:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(rateBase=exp(1)))

## Silence errors:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(quietErrs=T))

## Fix a parameter [asymptote] to 0.8:
m <- TEfit(dat[,c('respVar','timeVar')],control=tef_control(pFix=list(pAsym=.8)))

## End(Not run)

akcochrane/TEfits documentation built on June 12, 2025, 11:10 a.m.