setCovariate: Definition of a time-dependent covariate dataset

View source: R/dataConstruction_workflow.R

setCovariate    R Documentation

Description

A time-dependent covariate dataset specifies all follow-up measurements of a single variable. For each follow-up measurement of a subject in the cohort, the table specifies:

  1. a unique subject identifier,

  2. the date of the covariate measurement,

  3. the covariate value.
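
As a rough illustration of this layout, the toy table below holds one row per subject and measurement date. The column names ID, A1cDate, and A1c mirror the Examples section; the values are made up, and storing the dates with class Date is an assumption rather than a documented requirement.

```r
# Hypothetical toy data: three columns, one row per (subject, date) pair.
# Subject "s3" is in the cohort but has no follow-up measurement,
# so it contributes no row at all.
a1c_toy <- data.frame(
  ID      = c("s1", "s1", "s2"),
  A1cDate = as.Date(c("2020-01-15", "2020-06-02", "2020-03-10")),
  A1c     = c(7.1, 6.8, 8.3),
  stringsAsFactors = FALSE
)

# setCovariate() expects a data.table; if the data.table package is
# installed, the toy table can be converted with
# data.table::as.data.table(a1c_toy).
print(a1c_toy)
```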

Usage

setCovariate(
  data,
  type,
  IDvar,
  L_date,
  L_name,
  categorical,
  impute = NA,
  impute_default_level = NA,
  acute_change = FALSE
)

Arguments

data

data.table containing the input covariate dataset to be wrapped and processed. The table can contain multiple rows per subject, but at most one covariate measurement per subject on any given date. Subjects with no follow-up covariate measurements should have no rows. The table cannot contain missing values. Covariate values must be encoded as a character or numeric vector (e.g., factors are not allowed). The table cannot have columns named 'IDvar', 'L_date', or 'L_name'.
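
The constraints on data can be checked before calling setCovariate. The helper below is a minimal sketch in base R; check_covariate_input is a hypothetical name, not a function of the package, and it only mirrors the rules listed above rather than the package's own validation.

```r
# Hypothetical pre-flight check (not part of the package).
# Returns a character vector describing every violated constraint.
check_covariate_input <- function(data, IDvar, L_date, L_name) {
  df <- as.data.frame(data)  # accepts a data.frame or a data.table
  problems <- character(0)

  # Columns may not be named after the arguments themselves.
  if (any(c("IDvar", "L_date", "L_name") %in% names(df))) {
    problems <- c(problems, "reserved column name present")
  }
  # No missing values anywhere in the table.
  if (any(is.na(df))) {
    problems <- c(problems, "missing values present")
  }
  # At most one measurement per subject on a given date.
  if (anyDuplicated(df[, c(IDvar, L_date)]) > 0) {
    problems <- c(problems, "more than one measurement per subject and date")
  }
  # Covariate values must be character or numeric (factors fail both tests).
  value <- df[[L_name]]
  if (!(is.character(value) || is.numeric(value))) {
    problems <- c(problems, "covariate values are neither character nor numeric")
  }
  problems
}

# Made-up data for illustration.
ok <- data.frame(
  ID      = c("s1", "s1", "s2"),
  A1cDate = as.Date(c("2020-01-15", "2020-06-02", "2020-03-10")),
  A1c     = c(7.1, 6.8, 8.3),
  stringsAsFactors = FALSE
)
bad <- rbind(ok, ok[1, ])      # repeats a (subject, date) pair
bad$A1c <- factor(bad$A1c)     # factors are not allowed

ok_problems  <- check_covariate_input(ok,  "ID", "A1cDate", "A1c")
bad_problems <- check_covariate_input(bad, "ID", "A1cDate", "A1c")
print(bad_problems)
```

An empty result means none of the listed constraints is violated; it does not guarantee that setCovariate will accept the table.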

type

character specifying the covariate type: 'binary monotone increasing' (e.g., history of a diagnosis or procedure), 'interval' (e.g., hospital stay or prescription coverage), 'sporadic' (e.g., laboratory measurements), or 'indicator' (e.g., occurrence of a repeatable event).

IDvar

character providing the name of the column of data that contains the unique subject identifier.

L_date

character providing the name of the column of data that contains the date of the follow-up covariate measurement.

L_name

character providing the name of the column of data that contains the covariate values. For all covariate types, L_name must also be the name of the column of the cohort dataset that contains the baseline measurements of the time-dependent covariate. Baseline measurements of a covariate of type 'binary monotone increasing' can only be encoded with values 0 and 1 in the cohort dataset. All values in the column L_name of data must be set to 1 for a covariate of type 'binary monotone increasing'. The column L_name in data cannot contain the value 'None' for a covariate of type 'indicator' that is character. The column L_name in data and in the cohort dataset cannot contain the value 0 for a covariate of type 'indicator' that is numeric.
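
The value restrictions tied to the covariate type can be summarized in a small check. This is a sketch only: check_L_name_values is a hypothetical helper, data_values stands for the column L_name of data, baseline_values stands for the column L_name of the cohort dataset, and skipping missing baseline values is an assumption made here, not documented package behavior.

```r
# Hypothetical helper (not part of the package): TRUE when the values
# satisfy the documented restrictions for the given covariate type.
check_L_name_values <- function(type, data_values, baseline_values) {
  # Assumption: missing baseline values are skipped by this check.
  baseline <- baseline_values[!is.na(baseline_values)]

  if (type == "binary monotone increasing") {
    # Follow-up rows must all be 1; baseline may only be 0 or 1.
    all(data_values == 1) && all(baseline %in% c(0, 1))
  } else if (type == "indicator" && is.character(data_values)) {
    # 'None' is not allowed in data for a character indicator.
    !any(data_values == "None")
  } else if (type == "indicator" && is.numeric(data_values)) {
    # 0 is not allowed in data or in the cohort dataset for a numeric indicator.
    !any(data_values == 0) && !any(baseline == 0)
  } else {
    # No type-specific value restriction is listed for other cases.
    TRUE
  }
}

r1 <- check_L_name_values("binary monotone increasing", c(1, 1), c(0, 1, NA))
r2 <- check_L_name_values("binary monotone increasing", c(1, 2), c(0, 1))
r3 <- check_L_name_values("indicator", c("ER", "None"), c("ER"))
r4 <- check_L_name_values("indicator", c(1, 2), c(0, 1))
print(c(r1, r2, r3, r4))
```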

categorical

logical indicating whether the covariate is continuous ('FALSE') or categorical ('TRUE'). Must be 'TRUE' for a covariate of type 'binary monotone increasing' or 'indicator'. Cannot be missing.

impute

character specifying the imputation method for missing baseline measurements: 'default', 'mean', 'mode', or 'median'. If missing, 'mean' imputation is used for continuous covariates and 'mode' imputation is used for categorical covariates. Imputation with 'mean', 'mode', or 'median' is based on baseline measurements from subjects with observed baseline covariate values (stored in the cohort dataset). 'mean' and 'median' can only be used for continuous covariates. 'mode' can only be used for categorical covariates. Imputation with 'default' replaces missing values with 0 if the covariate is numeric and with 'Unknown' otherwise. Ignored for a covariate of type 'binary monotone increasing', 'interval', or 'indicator'.
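
The imputation rules can be pictured with the following base R sketch. It is not the package's implementation: impute_baseline is a hypothetical function, x stands for the baseline column of the cohort dataset, and the tie-breaking rule for 'mode' (first value encountered) is an assumption.

```r
# Hypothetical illustration of the documented imputation rules.
impute_baseline <- function(x, categorical, method = NA) {
  # If no method is given: mean for continuous, mode for categorical.
  if (is.na(method)) method <- if (categorical) "mode" else "mean"
  if (method %in% c("mean", "median") && categorical) {
    stop("'mean' and 'median' can only be used for continuous covariates")
  }
  if (method == "mode" && !categorical) {
    stop("'mode' can only be used for categorical covariates")
  }

  # Statistics are computed from observed baseline values only.
  observed <- x[!is.na(x)]
  levels_seen <- unique(observed)
  fill <- switch(method,
    mean    = mean(observed),
    median  = median(observed),
    # Most frequent observed value; ties go to the first one seen (assumption).
    mode    = levels_seen[which.max(tabulate(match(observed, levels_seen)))],
    default = if (is.numeric(x)) 0 else "Unknown",
    stop("unknown imputation method")
  )
  x[is.na(x)] <- fill
  x
}

impute_baseline(c(7, NA, 9), categorical = FALSE)
# 7 8 9
impute_baseline(c("low", "high", NA, "high"), categorical = TRUE)
# "low" "high" "high" "high"
impute_baseline(c("a", NA), categorical = TRUE, method = "default")
# "a" "Unknown"
```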

impute_default_level

character or numeric specifying the imputation value to be used when impute='default'. The value must be a length 1 character (resp. numeric) for a covariate encoded by a character (resp. numeric) vector. If missing, the default values 0 and 'Unknown' are used for continuous and categorical covariates, respectively. Ignored for a covariate of type 'binary monotone increasing', 'interval', or 'indicator'.

acute_change

logical indicating whether a covariate measurement collected on the date of an exposure change can be impacted by the change. The default value 'FALSE' indicates that the covariate measurement can be assumed to have preceded and possibly triggered the change in exposure. Cannot be missing.

Value

timeDepCovData object

See Also

timeDepCovData

Examples

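# Sporadic measurements treated as a continuous covariate
covariate1 <- setCovariate(a1cDT, "sporadic", "ID", "A1cDate", "A1c",
                           categorical = FALSE)
# Sporadic measurements treated as a categorical covariate
covariate2 <- setCovariate(egfrDT, "sporadic", "ID", "eGFRDate", "eGFR",
                           categorical = TRUE)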

romainkp/LtAtStructuR documentation built on Aug. 24, 2024, 3:38 p.m.