ctmaShapeRawData | R Documentation |
Reshapes raw data objects (handling missing time points, wrong time intervals, etc.).
ctmaShapeRawData(
dataFrame = NULL,
id = NULL,
inputDataFrameFormat = NULL,
inputTimeFormat = "time",
missingValues = NA,
n.manifest = NULL,
manifest.per.latent = NULL,
Tpoints = NULL,
allInputVariablesNames = NULL,
orderInputVariablesNames = NULL,
targetInputVariablesNames = NULL,
targetInputTDpredNames = NULL,
targetInputTIpredNames = NULL,
targetTimeVariablesNames = NULL,
outputDataFrameFormat = "long",
outputVariablesNames = "Y",
outputTDpredNames = NULL,
outputTIpredNames = NULL,
outputTimeVariablesNames = "time",
outputTimeFormat = "time",
scaleTime = 1,
minInterval = 1e-04,
minTolDelta = NULL,
maxTolDelta = NULL,
negTolDelta = FALSE,
min.val.n.Vars = 1,
min.val.Tpoints = 1,
standardization = "none"
)
dataFrame |
an R object containing data |
id |
the identifier of subjects if data are in long format |
inputDataFrameFormat |
"wide" or "long" |
inputTimeFormat |
"time" (default) or "delta" |
missingValues |
Missing value indicator, e.g., -999 or NA (default) |
n.manifest |
Number of process variables (e.g., 2 in a bivariate model) |
manifest.per.latent |
Number of manifest variables per latent factor. Frequently 1 manifest per latent, but, e.g., c(2, 3, 1) is also possible for 6 manifest variables loading on 3 latents |
Tpoints |
Number of time points in the data frame |
allInputVariablesNames |
vector of all process variable names, time dependent predictor names, time independent predictor names, and names of times/deltas. Only required if the dataFrame does not have column names. |
orderInputVariablesNames |
Either "names" or "time" (e.g., names: X1, X2, X3, Y1, Y2, Y3 vs. time: X1, Y1, X2, Y2, ...). For ctsem/CoTiMA, the output file will be ordered by time. |
targetInputVariablesNames |
The process variables in the dataFrame that should be used (in "names" or in "time" order; e.g., c("X1", "X3", "Y1", "Y3")). This is used to delete variables that are not required from the data frame. |
targetInputTDpredNames |
The actual time dependent (TD) predictor variable names, e.g., 3, or 6, or 9, ... names if Tpoints = 3. Internally, each set of 3 names (the first 3, the next 3, etc.) represents one TDpred. One typically does NOT have TD predictors in a CoTiMA. |
targetInputTIpredNames |
Time independent (TI) predictor names in the dataFrame. One typically does NOT have TI predictors in a CoTiMA unless it uses raw data only, where TIpreds are available for individual cases. |
targetTimeVariablesNames |
The time variable names in the dataFrame. They also define which Tpoints will be included in the output file, e.g., c("Time4", "Time9"). |
outputDataFrameFormat |
"long" (default) or "wide" |
outputVariablesNames |
"Y" (default; creates Y1_T0, Y2_T0, Y1_T1, Y2_T1, etc.), but can also be, e.g., c("X", "Y") (creates X_T0, Y_T0, X_T1, Y_T1, etc.). |
outputTDpredNames |
Will become "TD" if not specified |
outputTIpredNames |
Will become "TI" if not specified |
outputTimeVariablesNames |
"time" (default) |
outputTimeFormat |
"time" (default) or "delta" |
scaleTime |
A scalar by which the time variable is multiplied. Typical use is rescaling primary study time to the time scale used in other primary studies. For example, scaleTime=1/(60 x 60 x 24 x 365.25) rescales time provided in seconds (a frequent case when imported from SPSS) into years (60 sec x 60 min x 24 hrs x 365.25 days incl. leap years). |
minInterval |
A parameter (default = 0.0001) supplied to ctIntervalise. Set to smaller values than any possible observed measurement interval, but larger than 0.0001. The value is used for indicating unavailable time interval information (caused by missing values) because NA is technically not possible for time intervals. |
minTolDelta |
Set, e.g., to 1/24 to delete variables from time points that follow another time point too closely (e.g., within 1 hr) or that even precede it. This can be useful for deleting values generated by unreliable responding, e.g., in diary studies. Note that minTolDelta applies to the time intervals AFTER the scaleTime argument has been applied (i.e., scaleTime may need adaptation for each primary study, but minTolDelta does not). |
maxTolDelta |
Set, e.g., to 7 to delete variables from time points that are too far after another time point (e.g., 7 days, if all participants should have responded within a week). Note that maxTolDelta applies to the time intervals AFTER the scaleTime argument has been applied (i.e., scaleTime may need adaptation for each primary study, but maxTolDelta does not). |
negTolDelta |
FALSE (default) or TRUE. Delete entire cases that have at least one negative delta ('unreliable responding'; use minTolDelta to delete certain variables only) |
min.val.n.Vars |
Minimum number of valid variables. Default = 1 (retains cases with only 1 valid variable); 0 would retain cases with all variables missing (not very useful). Retaining participants who provide a single valid variable is technically possible, but these participants contribute only to the estimation of the variance/mean of this variable. Since variance/mean are 1/0 in most CoTiMA applications, this is not very informative and comes at the cost of additional computational burden. Setting min.val.n.Vars = 2 is recommended. |
min.val.Tpoints |
Minimum number of valid Tpoints (i.e., Tpoints where min.val.n.Vars is met). Default = 1 retains participants with a full set of valid variables at one or more Tpoints (the first of which will become T0). Setting min.val.Tpoints = 2 or higher retains only participants who provide longitudinal information. Since T0 covariances are usually not too interesting, min.val.Tpoints = 2 may be more reasonable than the default = 1. |
standardization |
The way to standardize possible raw data ("none", "withinTimeA", "withinTimeB", "withinColumn", "withinPerson", or "overall"). Only applies if the list for specifying raw data information contains the list element 'standardize=TRUE'. "withinTimeA" standardizes within time points and deletes cases with missing T0 data. "withinTimeB" does not delete cases, and in subsequent ctsem or CoTiMA applications the user is advised to use the argument 'sameInitialTimes=TRUE'. |
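The interplay of scaleTime, minTolDelta, and maxTolDelta can be sketched in base R. This is only an illustration of the arithmetic described above, not the function's internal implementation; the time stamps and the variable names timeInSeconds, tooClose, and tooFar are made up for the example.

```r
# Hypothetical time stamps (in seconds) of one participant; not taken from the package.
timeInSeconds <- c(0, 1800, 86400, 31557600)

# scaleTime as described above: seconds -> years (365.25 days per year).
scaleTime <- 1 / (60 * 60 * 24 * 365.25)
timeInYears <- timeInSeconds * scaleTime

# Intervals (deltas) between consecutive time points on the rescaled scale.
deltas <- diff(timeInYears)

# Tolerances refer to the rescaled time (years), i.e., AFTER scaleTime was applied.
minTolDelta <- 1 / (24 * 365.25)  # one hour, expressed in years
maxTolDelta <- 1 / 2              # half a year

# Assumption: intervals below/above the tolerances are the ones to be flagged.
tooClose <- deltas < minTolDelta
tooFar   <- deltas > maxTolDelta
```

Because the tolerances are compared with the rescaled deltas, the same minTolDelta and maxTolDelta values can be reused across primary studies whose raw time units differ; only scaleTime changes.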
A reshaped raw data file
## Not run:
tmpData <- data.frame(matrix(c(1,  2,  1, 2,  1, 2,  11, 26, 1,
                               NA, NA, 3, NA, 3, NA, 12, 27, 1,
                               1,  2,  1, 2,  1, 2,  NA, 24, 0),
                             nrow=3, byrow=TRUE))
colnames(tmpData) <- c("first_T0", "second_T0", "first_T1", "second_T1",
                       "TD1_0", "TD1_1",
                       "time1", "time2", "sex")
shapedData <- ctmaShapeRawData(dataFrame=tmpData,
                               inputDataFrameFormat="wide",
                               inputTimeFormat="time",
                               n.manifest=2,
                               Tpoints=2,
                               orderInputVariablesNames="time",
                               targetInputVariablesNames=c("first_T0", "second_T0",
                                                           "first_T1", "second_T1"),
                               targetInputTDpredNames=c("TD1_0", "TD1_1"),
                               targetInputTIpredNames="sex",
                               targetTimeVariablesNames=c("time1", "time2"),
                               scaleTime=1/12,
                               maxTolDelta=1.2)
head(shapedData)
## End(Not run)
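As a rough illustration of how min.val.n.Vars and min.val.Tpoints interact, the following base R sketch counts valid variables per time point and valid time points per participant. It assumes that a Tpoint counts as valid when at least min.val.n.Vars process variables are observed, as stated in the argument description; the data frame longData and all helper names are hypothetical, and the package's actual implementation may differ.

```r
# Hypothetical long-format data: 3 participants, 2 process variables (Y1, Y2).
longData <- data.frame(
  id   = c(1, 1, 2, 2, 3, 3),
  time = c(0, 1, 0, 1, 0, 1),
  Y1   = c(1, 1, NA, 3, NA, NA),
  Y2   = c(2, 2, NA, NA, 5, NA)
)

min.val.n.Vars  <- 2  # a time point is valid if at least 2 variables are observed
min.val.Tpoints <- 2  # a case is kept if at least 2 time points are valid

# Number of non-missing process variables in each row (one row = one time point).
nValidVars  <- rowSums(!is.na(longData[, c("Y1", "Y2")]))
validTpoint <- nValidVars >= min.val.n.Vars

# Number of valid time points per participant.
nValidTpoints <- tapply(validTpoint, longData$id, sum)

# Keep only participants who meet the minimum number of valid time points.
keepIds  <- as.numeric(names(nValidTpoints)[nValidTpoints >= min.val.Tpoints])
keptData <- longData[longData$id %in% keepIds, ]
```

With these settings only participant 1 remains, because participants 2 and 3 never provide two valid variables at the same time point.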