initialize: Palytic class generator.

initialize R Documentation

Palytic class generator.

Description

Create a new Palytic Object

Usage

Palytic$new(data, ids, dv, time)

Arguments

data

A data.frame that contains the variables named in ids, dv, phase, and time. Optionally, additional independent variables can be named in ivs. The fixed and random formulae for lme models and the formula for gamlss models are generated automatically when a Palytic object is created if these fields are left NULL.

ids

A character string giving the name of the id variable in data.

dv

A character string giving the name of the dependent variable in data.

time

A character string giving the name of the time variable in data. Random slopes for time are included by default. This can be overridden by specifying the fixed and random formulae for lme models or by specifying the formula for gamlss models.

phase

A character string giving the name of the phase variable in data. The phase*time interaction is included by default. This can be overridden by specifying the fixed and random formulae for lme models or by specifying the formula for gamlss models.

ivs

A list of one or more character strings giving the names of additional variables in data, e.g., list('iv1', 'iv2').

interactions

List of vector pairs of variable names for which interaction terms should be specified, e.g., list(c('time', 'phase'), c('time', 'iv1'), c('iv1', 'iv2')) where 'iv1' is the name of a variable in the list ivs.
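As a rough illustration of how such a list of pairs could map onto model terms, the base R sketch below joins each pair with `:` and appends the result to a fixed effects formula. The helper name `add_interactions` and the variable names are hypothetical; this is not a PersonAlytics function, and the package may build its formulae differently.

```r
# Hypothetical helper (not part of PersonAlytics): turn a list of
# variable-name pairs into interaction terms of a fixed effects formula.
add_interactions <- function(dv, main_effects, interactions) {
  # each pair c('a', 'b') becomes the term 'a:b'
  int_terms <- vapply(interactions, paste, character(1), collapse = ":")
  reformulate(c(main_effects, int_terms), response = dv)
}

f <- add_interactions(
  dv = "follicles",
  main_effects = c("time", "phase", "iv1", "iv2"),
  interactions = list(c("time", "phase"), c("time", "iv1"), c("iv1", "iv2"))
)

# inspect the terms that the formula contains
labels_f <- attr(terms(f), "term.labels")
labels_f
```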

time_power

The polynomial order for time, e.g., time^time_power. Fixed effects for time^1, ..., time^time_power will be included in models. Future releases will allow for other functions of time such as sin, but these can already be applied directly by transforming the time variable.
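The following base R sketch shows one way the terms time^1 through time^time_power could be written out as a fixed effects formula. The helper `time_terms` is a hypothetical name introduced only for this illustration and is not taken from the package source.

```r
# Hypothetical helper: character terms for time^1 ... time^time_power
time_terms <- function(time = "time", time_power = 1) {
  powers <- seq_len(time_power)
  # the linear term is the bare variable; higher powers are wrapped in I()
  ifelse(powers == 1, time, paste0("I(", time, "^", powers, ")"))
}

tt <- time_terms("Time", 3)
tt

# assemble a fixed effects formula for a cubic growth model
fixed <- reformulate(tt, response = "follicles")
attr(terms(fixed), "term.labels")
```

A function of time such as sin can be handled the same way by first adding a transformed column to the data (for example `dat$sinTime <- sin(2 * pi * dat$Time)`) and passing its name as the time variable.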

correlation

See corStruct. Defaults to NULL, see lme. Used by both lme and gamlss models.

correlation0

Other options such as autoSelect can change correlation; correlation0 retains the original value provided by the user.

family

The gamlss.family distribution.

fixed

The fixed effects model for lme models.

random

The random effects model for lme models.

formula

The model formula for gamlss models. sigma.formula, nu.formula, and tau.formula will be implemented in a future release.

method

See method in lme. Used for both lme and gamlss models.

standardize

Named logical list. Which variables should be standardized? The default is list(dv=FALSE, ivs=FALSE, byids=FALSE). See dv and ivs. The option byids controls whether standardization is done by individual or by group. Whenever variables are changed (e.g., ivs), the data are subset, or the options in standardize are changed, the raw data will be restandardized (see datac).
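To make the byids distinction concrete, the sketch below standardizes a toy variable once across the whole sample and once within each id, using base R only. The data and the helper `zscore` are made up for illustration, and the reading of byids as "within each id" is an assumption rather than a statement about the package internals.

```r
# Toy data: two ids with different means and spreads
dat <- data.frame(
  id = rep(1:2, each = 4),
  y  = c(1, 2, 3, 4, 10, 20, 30, 40)
)

zscore <- function(x) (x - mean(x)) / sd(x)

# group-level standardization: one mean and SD for the whole sample
dat$y_group <- zscore(dat$y)

# by-id standardization: each id uses its own mean and SD
dat$y_byid <- ave(dat$y, dat$id, FUN = zscore)

# within-id means and SDs of the by-id scores
byid_means <- tapply(dat$y_byid, dat$id, mean)
byid_sds   <- tapply(dat$y_byid, dat$id, sd)
dat
```

After by-id standardization, between-person differences in level and spread are removed, which changes the interpretation of any effect estimated on the standardized variable.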

autoSelect

List. The default is list(AR=list(P=3, Q=3), TO=list(polyMax=3), DIST=list()).

If no automated model selection is desired for the residual covariance structure (AR), the polynomial order for the relationship between time and the dependent variable (TO), or the dependent variable distribution (DIST), an empty list should be passed (e.g., autoSelect=list()).

If AR is in the list, the residual correlation structure will be automatically selected from among ARMA(p,q) models. See correlation. Since these models are not generally nested, model selection is done using an information criterion (see whichIC). Model selection for the residual covariance structure searches among p=1,...,P and q=1,...,Q, where P and Q are taken from the AR element of autoSelect, e.g., AR=list(P=3, Q=3). The values of p and q are passed to corARMA (e.g., corARMA(p=p,q=q)). If individual_mods=FALSE, this is done by comparing lme models for N>1 data. If individual_mods=TRUE, this is done using the auto.arima function on the residuals for each individual. For more detail, see the $GroupAR() method.
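The grid search over p and q can be sketched with base R alone. The example below is a stand-in, not the $GroupAR() implementation: it uses stats::arima on a simulated series in place of corARMA within lme or auto.arima, and the series, P, and Q are all assumptions chosen to keep the example small.

```r
set.seed(1)
# simulated stand-in for one individual's residual series (AR(1), phi = 0.6)
res <- arima.sim(model = list(ar = 0.6), n = 200)

P <- 2
Q <- 2
grid <- expand.grid(p = seq_len(P), q = seq_len(Q))

# fit every ARMA(p, q) in the grid; a fit that errors gets BIC = NA
grid$BIC <- vapply(seq_len(nrow(grid)), function(i) {
  fit <- tryCatch(
    arima(res, order = c(grid$p[i], 0, grid$q[i]), method = "ML"),
    error = function(e) NULL
  )
  if (is.null(fit)) NA_real_ else BIC(fit)
}, numeric(1))

# the candidate with the smallest BIC is retained
best <- grid[which.min(grid$BIC), ]
grid
best
```

Replacing BIC() with AIC() gives the whichIC="AIC" variant of the same comparison.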

If TO is in the list, models with polynomial powers of time from 1 to polyMax will be tested. For example, if polyMax=3 (implying a cubic growth model), the models compared include time, time + I(time^2), and time + I(time^2) + I(time^3). Since these models are nested, the best fitting model is selected using likelihood ratio tests with mixed effects models fit using maximum likelihood estimators in lme. This is done separately for each individual in ids if individual_mods=TRUE. For more detail, see the $getTO() method.
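A minimal sketch of the nested comparison described above, assuming the nlme package is installed and using its Ovary data (not OvaryICT). For simplicity the models use a random intercept only, which differs from the random time slopes that Palytic includes by default; this is not the $getTO() implementation.

```r
library(nlme)  # provides lme() and the Ovary data

# nested polynomial growth models, all fit by maximum likelihood
m1 <- lme(follicles ~ Time,
          random = ~ 1 | Mare, data = Ovary, method = "ML")
m2 <- lme(follicles ~ Time + I(Time^2),
          random = ~ 1 | Mare, data = Ovary, method = "ML")
m3 <- lme(follicles ~ Time + I(Time^2) + I(Time^3),
          random = ~ 1 | Mare, data = Ovary, method = "ML")

# sequential likelihood ratio tests: linear vs quadratic, quadratic vs cubic
lrt <- anova(m1, m2, m3)
lrt
```

Maximum likelihood rather than REML is used because REML likelihoods are not comparable across models with different fixed effects.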

If DIST is in the list and package='gamlss', the fitDist function of the gamlss package is applied to each dependent variable in dvs, and the best fitting distribution will be used for each dependent variable. For more detail, see the $dist() method in Palytic.

whichIC

Character. The default is whichIC="BIC".

The information criterion used for model selection, either the Akaike Information Criterion (whichIC="AIC") or the Bayesian Information Criterion (whichIC="BIC").

If the time variable is equally spaced, the residual autocorrelation structure is selected using the function forecast. If the time variable is not equally spaced, it is selected by comparing mixed effects models fit with lme using maximum likelihood estimators.

Residual autocorrelation structure is detected separately for each individual in ids if individual_mods=TRUE.

corStructs

Vector. A correlation structure for each case in ids. Not user accessible. Populated by PersonAlytic.

time_powers

Vector. A time_order for each case in ids. Not user accessible. Populated by PersonAlytic.

alignPhase

Character. Options include:

a. 'none', no changes are made to the time or phase variable.
b. 'align', align the time variable to be zero at the transition between the first and second phase (see alignPhases).
c. 'piecewise', add 'pwtime#' variables, which will replace time and time_power to create a piecewise linear growth curve model, and where '#' is the number of phases (i.e., one linear growth curve model per phase).
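The 'align' option can be pictured with a small base R sketch for a single case. The helper `align_time` is hypothetical, and placing zero at the first observation of the second phase (rather than the last observation of the first phase) is an assumption; see alignPhases for the package's actual behavior.

```r
# toy single-case data: phase 0 at times 1-4, phase 1 at times 5-8
dat <- data.frame(
  id    = 1,
  time  = 1:8,
  phase = rep(c(0, 1), each = 4)
)

# hypothetical helper: shift time so that the first observation of the
# second phase falls at time 0
align_time <- function(time, phase) {
  second <- sort(unique(phase))[2]
  time - min(time[phase == second])
}

dat$time_aligned <- align_time(dat$time, dat$phase)
dat
```

With several cases, the same shift would be applied within each id so that every case has its own transition at zero.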

ismonotone

Logical. Is the time variable for each case monotonically increasing (i.e., no returns to prior values)? This is determined during data cleaning as described for datac.
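One way to express such a check in base R is shown below. The data are invented, and treating "monotonically increasing" as strictly increasing in row order within each id is an assumption about the definition.

```r
dat <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  time = c(0, 1, 2, 0, 2, 1)   # id 2 returns to an earlier value
)

# TRUE for an id when every successive time value is larger than the last
is_monotone <- sapply(split(dat$time, dat$id),
                      function(t) all(diff(t) > 0))
is_monotone
```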

datac

data.frame. Cleaned data. Cleaning involves the following steps:

1. Check that the variables in ids, dv, time, phase, ivs, and interactions are in data.
2. The ids variable is forced to be numeric.
3. The validity of the correlation structure is checked, see corStruct.
4. Check that variables have non-zero variance.
5. If standardization is requested, standardize the data (see standardize).
6. Sort the data on ids and time.
7. If patients have < 2 observations, they are dropped from the data set.
8. Phase alignment (if any, see alignPhase).
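Steps 4, 6, and 7 of the cleaning sequence can be sketched in base R as follows. The toy data and variable names are assumptions, and the code is only an approximation of what such cleaning might look like, not the package's own routine.

```r
dat <- data.frame(
  id   = c(2, 1, 1, 3, 2, 1),
  time = c(1, 2, 0, 0, 0, 1),
  y    = c(5, 3, 1, 9, 4, 2)
)

# step 4: every analysis variable must have non-zero variance
vars_ok <- vapply(dat[c("time", "y")], var, numeric(1)) > 0
stopifnot(all(vars_ok))

# step 6: sort on id, then on time within id
dat <- dat[order(dat$id, dat$time), ]

# step 7: drop ids with fewer than 2 observations (id 3 in this toy data)
n_obs <- ave(dat$y, dat$id, FUN = length)
datc  <- dat[n_obs >= 2, ]
datc
```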

debugforeach

Logical flag for testing error handling in parallelized runs.

try_silent

Logical flag for testing error handling in Palytic methods.

Details

The fields data, ids, dv, and time are required. Using these, the default model dv=time with random intercepts for ids and random slopes for time is constructed. See the example. If phase is provided, the default model is dv=time+phase+phase*time, and if ivs are provided they are included in the model.
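The default model described above can be written out as lme-style formulae with a few lines of base R. The helper `default_formulae` is hypothetical and only mirrors the verbal description; the formulae that Palytic actually generates are available in the fixed, random, and formula fields of the object.

```r
# hypothetical helper, not the PersonAlytics implementation
default_formulae <- function(dv, time, ids, phase = NULL) {
  # time * phase expands to time + phase + time:phase
  rhs <- if (is.null(phase)) time else paste(time, phase, sep = " * ")
  list(
    fixed  = as.formula(paste(dv, "~", rhs)),
    # random intercepts for ids and random slopes for time
    random = as.formula(paste("~", time, "|", ids))
  )
}

f0 <- default_formulae("follicles", "Time", "Mare")
f1 <- default_formulae("follicles", "Time", "Mare", phase = "Phase")

attr(terms(f0$fixed), "term.labels")
attr(terms(f1$fixed), "term.labels")
all.vars(f1$random)
```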

Value

A new 'Palytic' object

Author(s)

Stephen Tueller stueller@rti.org

Examples


## Not run: 

# construct a new Palytic object and examine the default formulae
t1 <- Palytic$new(data = OvaryICT, ids='Mare', dv='follicles',
                  time='Time', phase='Phase', autoSelect=list())

# summary, descriptive, and plot methods
t1$summary()
t1$describe()
t1$plot()

# check the formulae creation
t1$fixed
t1$random
t1$formula
t1$correlation

# Compare gamlss and lme output, in which the models of the default formulae
# are fit. Note that the estimates are the same (within rounding) but that
# the gamlss SE are much smaller. This is due to gamlss modeling the variance
# which reduces the residual variance.
t1.gamlss <- t1$gamlss()
t1.lme    <- t1$lme()
t1.gamlss$tTable
t1.lme$tTable

# now change the correlation structure and compare gamlss and lme output,
# noting that the intercepts are very different now
t1$correlation <- "corARMA(p=1, q=0)"
t1$gamlss()$tTable
t1$lme()$tTable

# fit the model only to the first mare with ML instead of REML
t1$method <- 'ML'
t1$gamlss(OvaryICT$Mare==1)$tTable

# change the formula
t2 <- t1$clone()
t2$formula <- formula(follicles ~ Time * Phase +
                      re(random = ~Time + I(Time^2) | Mare, method = "ML",
                      correlation = corARMA(p=1,q=0)))
t2$formula

# random intercept only model
t2 <- t1$clone()
t2$random <- formula(~1|Mare)
t2$random
t2$formula

# random slope only model
t2$random <- formula(~0+Time|Mare)
t2$random
t2$formula

# note that prior examples set
# `autoSelect=list()`, here we use the default, which is to autoselect
# the correlation structure (AR), the polynomial order of time (TO) and
# the distribution
t1 <- Palytic$new(data = OvaryICT, ids='Mare',
                  dv='follicles', time='Time', phase='Phase',
                  autoSelect=list(AR=list(P=3, Q=3),
                                  TO=list(polyMax=3),
                                  DIST=list()))

# automatically select the polynomial order of time with getTO
t1$getTO()
t1$time_powers

# automatically select the ARMA model for residual correlation with GroupAR
t1$GroupAR()
t1$correlation
t1$corStructs

# automatically select the distribution, noting that calling $dist() updates $family
t1$dist()
t1$family

# automatically select all three which follows the order
# 1. DIST (which will switch package to gamlss for TO and AR)
# 2. TO (which can subsequently impact AR)
# 3. AR
t1$detect()

# construct a new Palytic object with no phase variable
t1 <- Palytic$new(data = OvaryICT, ids='Mare', dv='follicles',
                  time='Time', phase=NULL)
t1$plot()

# piecewise example
OvaryICT$TimeP <- round(30*OvaryICT$Time)
t1 <- Palytic$new(data = OvaryICT, ids = 'Mare',
                  dv = 'follicles', time = 'TimeP', phase = 'Phase',
                  alignPhase = 'piecewise', autoSelect=list())
t1$time
t1$lme()

# piecewise with finite population correction for a population of N=200
t1$lme()$tTable
t1$lme(fpc = TRUE, popsize2 = 200)$FPCtTable


## End(Not run)

ICTatRTI/PersonAlytics documentation built on Dec. 13, 2024, 11:06 p.m.