estimateTemporalDynamic: Estimation of the temporal causal discovery parameters

estimateTemporalDynamic    R Documentation

Estimation of the temporal causal discovery parameters

Description

This function estimates the number of layers and the number of time steps between consecutive layers that are needed to cover the dynamics of a temporal dataset when reconstructing a temporal causal graph. Using autocorrelation decay, the function computes the average relaxation time of the variables and, given a maximum number of nodes, deduces the number of layers and the number of time steps between layers to use.
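To make the idea of a relaxation time concrete, here is a minimal base R sketch under stated assumptions. It is not the miic implementation: the helper relaxation_time, the 1/e threshold and the rule that turns the relaxation time into a layer count are all hypothetical choices made for illustration.

```r
# Illustrative sketch only; this is NOT the miic implementation.
set.seed(42)

# Simulated AR(1) series whose autocorrelation decays slowly.
x <- as.numeric(stats::arima.sim(model = list(ar = 0.9), n = 1000))

# Hypothetical helper: first lag at which the sample autocorrelation
# falls below a threshold (1/e is an assumed convention, not from miic).
relaxation_time <- function(series, max_lag = 100, threshold = exp(-1)) {
  rho <- as.numeric(stats::acf(series, lag.max = max_lag, plot = FALSE)$acf)
  below <- which(rho < threshold)
  if (length(below) == 0) {
    return(max_lag)
  }
  below[1] - 1  # rho[1] corresponds to lag 0
}

tau <- relaxation_time(x)

# Assumed budget rule: with n_vars variables, at most max_nodes %/% n_vars
# layers fit in the graph; the layers are then spread over tau time steps.
n_vars <- 5
max_nodes <- 50
n_layers <- max(2, max_nodes %/% n_vars)
delta_t <- max(1, ceiling(tau / (n_layers - 1)))
```

The point of the sketch is the trade-off: a larger max_nodes allows more layers, so the same relaxation time can be covered with a smaller step between layers.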

Usage

estimateTemporalDynamic(
  input_data,
  state_order = NULL,
  mov_avg = NULL,
  max_nodes = 50,
  verbose_level = 1
)

Arguments

input_data

[a data frame] A data frame containing the observational data.
The expected data frame layout is variables as columns and time steps as rows. The time step information must be supplied in the first column and, for each time series, must be consecutive and in ascending order (increment of 1). Multiple trajectories can be provided: the function considers that a new trajectory starts each time a row has a smaller time step than the previous row.
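The layout described above can be sketched in base R as follows. The column names time_step, x and y and the helper make_trajectory are assumptions chosen for the example; only the position of the time step column (first) comes from the description. The last lines mimic the stated rule for detecting where a new trajectory starts.

```r
set.seed(1)

# Hypothetical helper building one trajectory with consecutive time steps.
make_trajectory <- function(n_steps) {
  data.frame(
    time_step = seq_len(n_steps),   # first column: 1, 2, ..., n_steps
    x = cumsum(rnorm(n_steps)),
    y = cumsum(rnorm(n_steps))
  )
}

# Two trajectories stacked row-wise; the time step restarts at 1.
input_data <- rbind(make_trajectory(100), make_trajectory(80))

# A new trajectory starts when the time step is smaller than on the
# previous row.
starts <- c(TRUE, diff(input_data$time_step) < 0)
traj_id <- cumsum(starts)
table(traj_id)
```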

state_order

[a data frame] An optional data frame providing extra information about the variables. It must have d rows, where d is the number of input variables, excluding the time step column.
For optional columns, if they are not provided or contain missing values, default values suitable for input_data will be used.

The following structure (named columns) is expected:

"var_names" (required) contains the name of each variable as specified by colnames(input_data), excluding the time steps column.

"var_type" (optional) contains a binary value that specifies if each variable is to be considered as discrete (0) or continuous (1). Discrete variables will be excluded from the temporal dynamic estimation.

"is_contextual" (optional) contains a binary value that specifies if a variable is to be considered as a contextual variable (1) or not (0). Contextual variables will be excluded from the temporal dynamic estimation.

"mov_avg" (optional) contains an integer value that specifies the size of the moving average window to be applied to the variable. Note that if a "mov_avg" column is present in state_order, its values override the mov_avg function parameter.
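As an illustration of the structure above, a state_order data frame could be built like this. The variable names (temperature, pressure, season) and the toy values are invented for the example; the column names of state_order are the ones listed above.

```r
# Toy input data: time steps in the first column, then three variables.
input_data <- data.frame(
  time_step = 1:6,
  temperature = c(20.1, 20.4, 20.9, 21.3, 21.0, 20.6),
  pressure = c(1012, 1011, 1013, 1015, 1014, 1012),
  season = c(0, 0, 0, 0, 0, 0)
)

state_order <- data.frame(
  var_names = colnames(input_data)[-1],  # excludes the time step column
  var_type = c(1, 1, 0),                 # 1 = continuous, 0 = discrete
  is_contextual = c(0, 0, 1),            # season treated as contextual
  mov_avg = c(3, NA, NA),                # NA: let the default apply
  stringsAsFactors = FALSE
)

# One row per variable, time step column excluded.
stopifnot(nrow(state_order) == ncol(input_data) - 1)
```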

mov_avg

[an integer] Optional, NULL by default.
When an integer >= 2 is supplied, a moving average operation is applied to all variables that are neither discrete nor contextual. If no state_order is provided, the discrete/continuous nature of each variable is deduced from the input data. To apply a moving average only to specific columns, consider using a mov_avg column in the state_order parameter.
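For readers unfamiliar with the operation, a moving average of a given window size can be sketched with stats::filter. This is an assumption-laden illustration: the helper moving_average is hypothetical, and whether miic uses a trailing or a centered window is not stated here, so the trailing window below is only one possible choice.

```r
# Hypothetical helper: trailing moving average over `window` points.
moving_average <- function(x, window) {
  stopifnot(window >= 2)
  # sides = 1 averages the current value and the (window - 1) previous ones;
  # the first (window - 1) results are NA because the window is incomplete.
  as.numeric(stats::filter(x, rep(1 / window, window), sides = 1))
}

smoothed <- moving_average(c(1, 2, 3, 4, 5), window = 2)
```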

max_nodes

[a positive integer] The maximum number of nodes in the final time-unfolded causal graph. The more nodes allowed in the temporal causal discovery, the more precise the discovery, but at the cost of a longer execution time. The default is set to 50 for fast causal discovery. On recent computers, values up to 200 or 300 nodes are usually possible (depending on the number of trajectories and time steps in the input data).

verbose_level

[an integer value in the range [0,2], 1 by default] The level of verbosity: 0 = no display, 1 = summary display, 2 = full display.

Value

A named list with two items:

  • n_layers: the number of layers

  • delta_t: the number of time steps between the layers
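A minimal usage sketch, assuming the miic package is installed, might look like the following. The simulated data and the helper make_trajectory are invented for the example; the call uses only the arguments documented above, and it is guarded so that the script still runs when miic is not available.

```r
set.seed(1)

# Hypothetical helper: one trajectory of two autocorrelated variables.
make_trajectory <- function(n_steps) {
  data.frame(
    time_step = seq_len(n_steps),
    x = as.numeric(stats::arima.sim(model = list(ar = 0.8), n = n_steps)),
    y = as.numeric(stats::arima.sim(model = list(ar = 0.5), n = n_steps))
  )
}
input_data <- rbind(make_trajectory(200), make_trajectory(200))

result <- NULL
if (requireNamespace("miic", quietly = TRUE)) {
  result <- miic::estimateTemporalDynamic(
    input_data,
    max_nodes = 50,
    verbose_level = 0
  )
  # The two items of the returned list.
  n_layers <- result$n_layers
  delta_t <- result$delta_t
}
```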


miic documentation built on Sept. 18, 2024, 1:07 a.m.