ctmaShapeRawData: ctmaShapeRawData

View source: R/ctmaShapeRawData.R

ctmaShapeRawData R Documentation

ctmaShapeRawData

Description

Reshapes raw data objects (handling missing time points, wrong time intervals, etc.).

Usage

ctmaShapeRawData(
  dataFrame = NULL,
  id = NULL,
  inputDataFrameFormat = NULL,
  inputTimeFormat = "time",
  missingValues = NA,
  n.manifest = NULL,
  manifest.per.latent = NULL,
  Tpoints = NULL,
  allInputVariablesNames = NULL,
  orderInputVariablesNames = NULL,
  targetInputVariablesNames = NULL,
  targetInputTDpredNames = NULL,
  targetInputTIpredNames = NULL,
  targetTimeVariablesNames = NULL,
  outputDataFrameFormat = "long",
  outputVariablesNames = "Y",
  outputTDpredNames = NULL,
  outputTIpredNames = NULL,
  outputTimeVariablesNames = "time",
  outputTimeFormat = "time",
  scaleTime = 1,
  minInterval = 1e-04,
  minTolDelta = NULL,
  maxTolDelta = NULL,
  negTolDelta = FALSE,
  min.val.n.Vars = 1,
  min.val.Tpoints = 1,
  standardization = "none"
)

Arguments

dataFrame

an R object containing data

id

the identifier of subjects if data are in long format

inputDataFrameFormat

"wide" or "long"

inputTimeFormat

"time" (default) or "delta"

missingValues

Missing value indicator, e.g., -999 or NA (default)

n.manifest

Number of process variables (e.g., 2 in a bivariate model)

manifest.per.latent

Number of manifest variables per latent factor. Frequently 1 manifest per latent, but, e.g., c(2,3,1) is also possible for 6 manifest variables loading on 3 latents.

Tpoints

Number of time points in the data frame

allInputVariablesNames

vector of all process variable names, time dependent predictor names, time independent predictor names, and names of times/deltas. Only required if the dataFrame does not have column names.

orderInputVariablesNames

"names" vs. "time" (e.g., names: X1, X2, X3, Y1, Y2, Y3 vs. time: X1, Y1, X2, Y2, ...). For ctsem/CoTiMA, the output file will be ordered by time.

targetInputVariablesNames

The process variables in the dataFrame that should be used (in "names" or in "time" order; e.g., c("X1", "X3", "Y1", "Y3")). This is used to delete variables from the data frame that are not required.

targetInputTDpredNames

The actual time dependent (TD) predictor variable names, e.g., 3, or 6, or 9, ... names if Tpoints = 3. Internally, each set of 3 names (one per time point) represents one TDpred. One typically does NOT have TD predictors in a CoTiMA.

targetInputTIpredNames

Time independent (TI) predictor names in the dataFrame. One typically does NOT have TI predictors in a CoTiMA unless it uses raw data only, where TIpreds are available for individual cases.

targetTimeVariablesNames

The time variable names in the dataFrame. They also define which Tpoints will be included in the output file, e.g., c("Time4", "Time9").

outputDataFrameFormat

"long" (default) or "wide"

outputVariablesNames

"Y" (default; creates Y1_T0, Y2_T0, Y1_T1, Y2_T1, etc.), but can also be, e.g., c("X", "Y") (creates X_T0, Y_T0, X_T1, Y_T1, etc.).

outputTDpredNames

Will become "TD" if not specified

outputTIpredNames

Will become "TI" if not specified

outputTimeVariablesNames

"time" (default)

outputTimeFormat

"time" (default) or "delta"

scaleTime

A scalar by which the time variable is multiplied. Typical use is rescaling primary study time to the time scale used in other primary studies. For example, scaleTime=1/(60 x 60 x 24 x 365.25) rescales time provided in seconds (a frequent case when imported from SPSS) into years (60sec x 60min x 24hrs x 365.25days incl. leap years).

minInterval

A parameter (default = 0.0001) supplied to ctIntervalise. Set to a value smaller than any possible observed measurement interval, but larger than 0.0001. The value is used to indicate unavailable time interval information (caused by missing values) because NA is technically not possible for time intervals.

minTolDelta

Set, e.g., to 1/24, to delete variables from time points that follow another time point too closely (e.g., within 1 hr) or that even precede it. Could be useful to delete values generated by unreliable responding, e.g., in diary studies. Note that minTolDelta applies to the time intervals AFTER the scaleTime argument has been applied (i.e., scaleTime may need adaptation for each primary study, but minTolDelta does not).

maxTolDelta

Set, e.g., to 7, to delete variables from time points that are too far after another time point (e.g., 7 days, if all participants should have responded within a week). Note that maxTolDelta applies to the time intervals AFTER the scaleTime argument has been applied (i.e., scaleTime may need adaptation for each primary study, but maxTolDelta does not).

negTolDelta

FALSE (default) or TRUE. Delete entire cases that have at least one negative delta ('unreliable responding'; use minTolDelta to delete certain variables only)

min.val.n.Vars

Minimum number of valid variables. Default = 1 (retains cases with only 1 valid variable); 0 would retain cases with all variables missing (not very useful). Retaining participants who provide a single valid variable is technically possible, but these participants contribute to the estimation of the variance/mean of this variable only. Since variance/mean are 1/0 in most CoTiMA applications, this is not very informative and comes at the cost of additional computational burden. Setting min.val.n.Vars = 2 is recommended.

min.val.Tpoints

Minimum number of valid Tpoints (i.e., Tpoints where min.val.n.Vars is met). Default = 1 retains participants with a full set of valid variables at one or more Tpoints (the first of which will become T0). Setting min.val.Tpoints = 2 or higher retains only participants who provide longitudinal information. Since T0 covariances are usually not too interesting, min.val.Tpoints = 2 may be more reasonable than the default = 1.

standardization

The way to standardize possible raw data ("none", "withinTimeA", "withinTimeB", "withinColumn", "withinPerson", or "overall"). Only applies if the list for specifying raw data information contains the list element 'standardize=TRUE'. 'withinTimeA' standardizes within time points and deletes cases with missing T0 data. 'withinTimeB' does not delete cases, and in subsequent ctsem or CoTiMA applications the user is advised to use the argument 'sameInitialTimes=TRUE'.
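
The interplay of scaleTime, minTolDelta, and maxTolDelta may be easier to see in a small base R sketch. It is not the internal code of ctmaShapeRawData; the time values and the tolerance settings are assumptions chosen for illustration, and "delta" is taken to mean the interval between successive time points.

```r
# Illustrative sketch only (base R); not the internal code of ctmaShapeRawData.
# Hypothetical measurement times of one participant, recorded in seconds.
timesInSeconds <- c(0, 3600, 86400 * 30, 86400 * 400)

# scaleTime as in the argument description: seconds -> years.
scaleTime <- 1 / (60 * 60 * 24 * 365.25)
timesInYears <- timesInSeconds * scaleTime

# "delta" format (assumed meaning): intervals between successive time points.
deltas <- diff(timesInYears)

# Tolerances are expressed on the rescaled (year) time scale.
minTolDelta <- 1 / 365.25  # assumed setting: one day
maxTolDelta <- 1           # assumed setting: one year

tooClose <- deltas < minTolDelta  # interval shorter than tolerated
tooFar   <- deltas > maxTolDelta  # interval longer than tolerated
data.frame(delta = deltas, tooClose = tooClose, tooFar = tooFar)
```

Because the tolerances are compared against the rescaled intervals, the same minTolDelta and maxTolDelta values can be reused across primary studies once each study's scaleTime maps its time variable onto the common scale.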

Value

A reshaped raw data file

Examples

## Not run: 
tmpData <- data.frame(matrix(c(1,  2,  1, 2,  1, 2,  11, 26, 1,
                               NA, NA, 3, NA, 3, NA, 12, 27, 1,
                               1,  2,  1, 2,  1, 2,  NA, 24, 0 ),
                          nrow=3, byrow=TRUE))
colnames(tmpData) <- c("first_T0", "second_T0", "first_T1", "second_T1",
                         "TD1_0", "TD1_1",
                        "time1", "time2", "sex")
shapedData <- ctmaShapeRawData(dataFrame=tmpData,
                               inputDataFrameFormat="wide",
                               inputTimeFormat="time",
                               n.manifest=2,
                               Tpoints=2,
                               orderInputVariablesNames="time",
                               targetInputVariablesNames=c("first_T0", "second_T0",
                                                           "first_T1", "second_T1"),
                               targetInputTDpredNames=c("TD1_0", "TD1_1"),
                               targetInputTIpredNames="sex",
                               targetTimeVariablesNames=c("time1", "time2"),
                               scaleTime=1/12,
                               maxTolDelta=1.2)
head(shapedData)

## End(Not run)
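
The effect of min.val.n.Vars and min.val.Tpoints on which cases are retained can be sketched in base R as follows. This is an illustration under stated assumptions, not the function's actual implementation: the data frame `wide`, the helper list `varsByTpoint`, and the reading that a time point is valid when it has at least min.val.n.Vars non-missing variables are all introduced here for demonstration.

```r
# Illustrative sketch only (base R); not the internal code of ctmaShapeRawData.
# Hypothetical wide data: 2 variables (Y1, Y2) at 2 time points (T0, T1).
wide <- data.frame(Y1_T0 = c(1, NA, NA),
                   Y2_T0 = c(2, NA, NA),
                   Y1_T1 = c(1, 3, NA),
                   Y2_T1 = c(2, NA, NA))

# Hypothetical helper: which columns belong to which time point.
varsByTpoint <- list(T0 = c("Y1_T0", "Y2_T0"),
                     T1 = c("Y1_T1", "Y2_T1"))

min.val.n.Vars  <- 2
min.val.Tpoints <- 1

# Number of non-missing variables per case and time point (cases x Tpoints).
nValid <- sapply(varsByTpoint,
                 function(v) rowSums(!is.na(wide[, v, drop = FALSE])))

# Number of time points per case that reach min.val.n.Vars valid variables.
validTpoints <- rowSums(nValid >= min.val.n.Vars)

# Keep cases with at least min.val.Tpoints valid time points.
keep <- validTpoints >= min.val.Tpoints
wide[keep, ]
```

With these settings only the first case is kept. Lowering min.val.n.Vars to 1 would also retain the second case, which provides a single valid value at T1.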


CoTiMA documentation built on May 29, 2024, 11:39 a.m.