| BKTRRegressor | R Documentation |
A BKTRRegressor holds all the key elements to accomplish the MCMC sampling algorithm (Algorithm 1 of the paper).
data_df: The dataframe containing all the covariates through time and space (including the response variable)
y: The response variable tensor
omega: The tensor indicating which response values are not missing
covariates: The tensor containing all the covariates
covariates_dim: The dimensions of the covariates tensor
logged_params_tensor: The tensor containing all the sampled hyperparameters
tau: The precision hyperparameter
spatial_decomp: The spatial covariate decomposition
temporal_decomp: The temporal covariate decomposition
covs_decomp: The feature covariate decomposition
result_logger: The result logger instance used to store the results of the MCMC sampling
has_completed_sampling: Boolean showing whether the MCMC sampling has been completed
spatial_kernel: The spatial kernel used
temporal_kernel: The temporal kernel used
spatial_positions_df: The dataframe containing the spatial positions
temporal_positions_df: The dataframe containing the temporal positions
spatial_params_sampler: The spatial kernel hyperparameter sampler
temporal_params_sampler: The temporal kernel hyperparameter sampler
tau_sampler: The tau hyperparameter sampler
precision_matrix_sampler: The precision matrix sampler
spatial_ll_evaluator: The spatial likelihood evaluator
temporal_ll_evaluator: The temporal likelihood evaluator
rank_decomp: The rank of the CP decomposition
burn_in_iter: The number of burn-in iterations
sampling_iter: The number of sampling iterations
max_iter: The total number of iterations
a_0: The initial value for the shape in the gamma distribution generating tau
b_0: The initial value for the rate in the gamma distribution generating tau
formula: The formula used to specify the relation between the response variable and the covariates
spatial_labels: The spatial labels
temporal_labels: The temporal labels
feature_labels: The feature labels
geo_coords_projector: The geographic coordinates projector
summary: A summary of the BKTRRegressor instance
beta_covariates_summary: A dataframe containing the summary of the beta covariates
y_estimates: A dataframe containing the y estimates
imputed_y_estimates: A dataframe containing the imputed y estimates
beta_estimates: A dataframe containing the beta estimates
hyperparameters_per_iter_df: A dataframe containing the hyperparameter values per iteration
decomposition_tensors: List of all used decomposition tensors
new(): Create a new BKTRRegressor object.
BKTRRegressor$new( data_df, spatial_positions_df, temporal_positions_df, rank_decomp = 10, burn_in_iter = 500, sampling_iter = 500, formula = NULL, spatial_kernel = KernelMatern$new(smoothness_factor = 3), temporal_kernel = KernelSE$new(), sigma_r = 0.01, a_0 = 1e-06, b_0 = 1e-06, has_geo_coords = TRUE, geo_coords_scale = 10 )
data_df: data.table: A dataframe containing all the covariates through time and space. The dataframe must have two index columns named 'location' and 'time'. It must also contain every possible combination of 'location' and 'time' (i.e. rows for missing observations must be present and filled with NaN). So if the dataframe has 10 locations and 5 time points, it should have 50 rows (10 x 5). If formula is NULL, the dataframe should contain the response variable 'Y' as the first column. Note that the covariate columns cannot contain NaN values, but the response variable can.
spatial_positions_df: data.table: Spatial kernel input tensor used to calculate covariate distance. Vector of length equal to the number of location points.
temporal_positions_df: data.table: Temporal kernel input tensor used to calculate covariate distance. Vector of length equal to the number of time points.
rank_decomp: Integer: Rank of the CP decomposition (Paper – R). Defaults to 10.
burn_in_iter: Integer: Number of iterations before sampling (Paper – K_1). Defaults to 500.
sampling_iter: Integer: Number of sampling iterations (Paper – K_2). Defaults to 500.
formula: A Wilkinson R formula to specify the relation between the response variable 'Y' and the covariates. If NULL, the first column of the data frame will be used as the response variable and all the other columns will be used as the covariates. Defaults to NULL.
spatial_kernel: Kernel: Spatial kernel used. Defaults to KernelMatern(smoothness_factor = 3).
temporal_kernel: Kernel: Temporal kernel used. Defaults to KernelSE().
sigma_r: Numeric: Variance of the white noise process (\tau^{-1}). Defaults to 1E-2.
a_0: Numeric: Initial value for the shape (\alpha) in the gamma distribution generating tau. Defaults to 1E-6.
b_0: Numeric: Initial value for the rate (\beta) in the gamma distribution generating tau. Defaults to 1E-6.
has_geo_coords: Boolean: Whether the spatial positions df use geographic coordinates (latitude, longitude). Defaults to TRUE.
geo_coords_scale: Numeric: Scale factor to convert geographic coordinates to euclidean 2D space via Mercator projection using x & y domains of [-scale/2, +scale/2]. Only used if has_geo_coords is TRUE. Defaults to 10.
A new BKTRRegressor object.
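The requirement that data_df contain every 'location' x 'time' combination can be met by building the full grid first and merging the observed rows into it. The following is a minimal base R sketch on toy data, not code taken from the package: the labels and the response column name y are hypothetical, and a plain data.frame is used instead of a data.table to keep the example self-contained.

```r
# Toy setup: 10 locations x 5 time points (hypothetical labels)
locations <- paste0("loc_", 1:10)
times <- paste0("t_", 1:5)
full_grid <- expand.grid(location = locations, time = times,
                         stringsAsFactors = FALSE)

set.seed(42)
observed <- full_grid[-c(3, 17, 41), ]  # drop three rows to mimic gaps
observed$y <- rnorm(nrow(observed))     # hypothetical response column

# Left-join the observations onto the full grid; unmatched rows get NA
complete_df <- merge(full_grid, observed, by = c("location", "time"),
                     all.x = TRUE)

# Mark the missing responses as NaN, as the argument description asks
complete_df$y[is.na(complete_df$y)] <- NaN
```

After this step complete_df has one row per combination (10 x 5 = 50), with NaN only in the response column.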
mcmc_sampling(): Launch the MCMC sampling process.
For a predefined number of iterations:
1. Sample spatial kernel hyperparameters
2. Sample temporal kernel hyperparameters
3. Sample the precision matrix from a Wishart distribution
4. Sample a new spatial covariate decomposition
5. Sample a new feature covariate decomposition
6. Sample a new temporal covariate decomposition
7. Calculate respective errors for the iterations
8. Sample a new tau value
9. Collect all the important data for the iteration
BKTRRegressor$mcmc_sampling()
NULL. Results are stored and can be accessed via summary().
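The burn-in and sampling phases of such a loop, together with the tau step, can be sketched in base R as follows. This is only an illustration under stated assumptions: it assumes the total number of iterations is burn_in_iter + sampling_iter, it replaces the kernel, precision matrix and decomposition steps with random placeholder errors, and it uses the textbook conjugate gamma update for a Gaussian precision. Whether the package uses exactly this update is an assumption.

```r
set.seed(1)
burn_in_iter <- 5
sampling_iter <- 10
max_iter <- burn_in_iter + sampling_iter  # assumed total number of iterations
a_0 <- 1e-6
b_0 <- 1e-6
n_obs <- 50                               # number of observed responses (toy value)

tau_samples <- numeric(sampling_iter)
for (i in seq_len(max_iter)) {
  # Placeholder for the errors computed after the decomposition steps
  errors <- rnorm(n_obs, sd = 0.5)

  # Conjugate gamma draw for the precision given the current errors
  tau <- rgamma(1, shape = a_0 + n_obs / 2, rate = b_0 + sum(errors^2) / 2)

  # Only draws made after the burn-in phase are kept
  if (i > burn_in_iter) {
    tau_samples[i - burn_in_iter] <- tau
  }
}
```

Discarding the first burn_in_iter draws leaves sampling_iter values from which estimates are computed.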
predict(): Use interpolation to predict betas and response values for new data.
BKTRRegressor$predict( new_data_df, new_spatial_positions_df = NULL, new_temporal_positions_df = NULL, jitter = 1e-05 )
new_data_df: data.table: New covariates. Must have the same columns as the covariates used to fit the model. The index should contain the combination of all old spatial coordinates with all new temporal coordinates, the combination of all new spatial coordinates with all old temporal coordinates, and the combination of all new spatial coordinates with all new temporal coordinates.
new_spatial_positions_df: data.table or NULL: A data frame containing the new spatial positions. Defaults to NULL.
new_temporal_positions_df: data.table or NULL: A data frame containing the new temporal positions. Defaults to NULL.
jitter: Numeric or NULL: A small value to add to the diagonal of the precision matrix. Defaults to 1e-05.
List: A list of two dataframes. The first represents the beta forecasted for all new spatial locations or temporal points. The second represents the forecasted response for all new spatial locations or temporal points.
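The three index combinations expected in new_data_df (old x new, new x old, new x new) are simply every row of the full grid that is not an old location paired with an old time. A small base R sketch with hypothetical labels makes the row counts explicit:

```r
# Hypothetical labels: 6 locations and 4 time points in total
locations <- paste0("loc_", 1:6)
times <- paste0("t_", 1:4)
full_df <- expand.grid(location = locations, time = times,
                       stringsAsFactors = FALSE)

new_locations <- locations[1:2]  # held out for prediction
new_times <- times[1]            # held out for prediction

# A row belongs to the fitting data only if both its location and time are old
old_mask <- !(full_df$location %in% new_locations) &
  !(full_df$time %in% new_times)
reg_df <- full_df[old_mask, ]
new_df <- full_df[!old_mask, ]

n_new_s <- length(new_locations)
n_old_s <- length(locations) - n_new_s
n_new_t <- length(new_times)
n_old_t <- length(times) - n_new_t
expected_new_rows <- n_old_s * n_new_t + n_new_s * n_old_t + n_new_s * n_new_t
```

Here reg_df has 4 x 3 = 12 rows and new_df holds the remaining 12 rows (4 + 6 + 2).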
get_iterations_betas(): Return all sampled betas through sampling iterations for a given set of spatial, temporal and feature labels. Useful for plotting the distribution of sampled beta values.
BKTRRegressor$get_iterations_betas( spatial_label, temporal_label, feature_label )
spatial_label: String: The spatial label for which we want to get the betas
temporal_label: String: The temporal label for which we want to get the betas
feature_label: String: The feature label for which we want to get the betas
A list containing the sampled betas through iterations for the given labels
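Once the sampled betas for one label combination are available, their distribution can be summarised or plotted with base R. The sketch below uses randomly generated stand-in values instead of real output, and assumes the returned list holds one numeric value per sampling iteration.

```r
set.seed(7)
# Stand-in for the returned list: one sampled beta per sampling iteration
sampled_betas <- as.list(rnorm(10, mean = 0.3, sd = 0.1))

beta_values <- unlist(sampled_betas)
beta_mean <- mean(beta_values)
beta_interval <- quantile(beta_values, probs = c(0.025, 0.975))

# Kernel density estimate of the draws; plot(beta_density) would display it
beta_density <- density(beta_values)
```

With as few as 10 draws the interval is very rough; it is shown here only to illustrate the mechanics.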
get_beta_summary_df(): Get a summary of estimated beta values. If no labels are given, then the summary is for all the betas. If labels are given, then the summary is for the given labels.
BKTRRegressor$get_beta_summary_df( spatial_labels = NULL, temporal_labels = NULL, feature_labels = NULL )
spatial_labels: vector: The spatial labels used in summary. If NULL, then all spatial labels are used. Defaults to NULL.
temporal_labels: vector: The temporal labels used in summary. If NULL, then all temporal labels are used. Defaults to NULL.
feature_labels: vector: The feature labels used in summary. If NULL, then all feature labels are used. Defaults to NULL.
A new data.table with the beta summary for the given labels.
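The "NULL means all labels" convention can be illustrated with a small base R filter. The helper filter_betas and the toy beta_df below are hypothetical and are not part of the package; they only mimic how each label argument narrows the selection when it is supplied.

```r
# Toy table of beta estimates (hypothetical labels and values)
beta_df <- expand.grid(location = c("loc_1", "loc_2"),
                       time = c("t_1", "t_2", "t_3"),
                       feature = c("Intercept", "mean_temp_c"),
                       stringsAsFactors = FALSE)
beta_df$beta <- seq_len(nrow(beta_df)) / 10

# Hypothetical helper: a NULL argument leaves that dimension unfiltered
filter_betas <- function(df, spatial_labels = NULL, temporal_labels = NULL,
                         feature_labels = NULL) {
  keep <- rep(TRUE, nrow(df))
  if (!is.null(spatial_labels)) keep <- keep & df$location %in% spatial_labels
  if (!is.null(temporal_labels)) keep <- keep & df$time %in% temporal_labels
  if (!is.null(feature_labels)) keep <- keep & df$feature %in% feature_labels
  df[keep, , drop = FALSE]
}

all_betas <- filter_betas(beta_df)
temp_betas <- filter_betas(beta_df, feature_labels = "mean_temp_c")
```

Calling the helper without arguments keeps all 12 rows, while restricting to one feature keeps 6.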
clone(): The objects of this class are cloneable with this method.
BKTRRegressor$clone(deep = FALSE)
deep: Whether to make a deep clone.
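A clone method matters because objects with reference semantics are not copied by plain assignment. Assuming BKTRRegressor is an R6 class (which the $new() and $clone() interface suggests), its instances are environments, and the behaviour can be sketched with base R environments alone:

```r
original <- new.env()
original$tau <- 1

alias <- original  # plain assignment: both names point to the same object
alias$tau <- 2     # this also changes original$tau

# Copy the bindings into a fresh environment (roughly what a shallow clone does)
copied <- list2env(as.list(original, all.names = TRUE), envir = new.env())
copied$tau <- 3    # original is unaffected
```

A shallow copy like this still shares any nested reference objects with the original, which is the case deep = TRUE is meant to address.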
# Create a BIXI data collection instance containing multiple dataframes
bixi_data <- BixiData$new(is_light = TRUE) # Use light version for example
# Create a BKTRRegressor instance
bktr_regressor <- BKTRRegressor$new(
  formula = nb_departure ~ 1 + mean_temp_c + area_park,
  data_df = bixi_data$data_df,
  spatial_positions_df = bixi_data$spatial_positions_df,
  temporal_positions_df = bixi_data$temporal_positions_df,
  burn_in_iter = 5, sampling_iter = 10) # For example only (too few iterations)
# Launch the MCMC sampling
bktr_regressor$mcmc_sampling()
# Get the summary of the bktr regressor
summary(bktr_regressor)
# Get estimated response variables for missing values
bktr_regressor$imputed_y_estimates
# Get the list of sampled betas for given spatial, temporal and feature labels
bktr_regressor$get_iterations_betas(
  spatial_label = bixi_data$spatial_positions_df$location[1],
  temporal_label = bixi_data$temporal_positions_df$time[1],
  feature_label = 'mean_temp_c')
# Get the summary of all betas for the 'mean_temp_c' feature
bktr_regressor$get_beta_summary_df(feature_labels = 'mean_temp_c')
## PREDICTION EXAMPLE ##
# Create a light version of the BIXI data collection instance
bixi_data <- BixiData$new(is_light = TRUE)
# Simplify variable names
data_df <- bixi_data$data_df
spa_pos_df <- bixi_data$spatial_positions_df
temp_pos_df <- bixi_data$temporal_positions_df
# Keep some data aside for prediction
new_spa_pos_df <- spa_pos_df[1:2, ]
new_temp_pos_df <- temp_pos_df[1:5, ]
reg_spa_pos_df <- spa_pos_df[-(1:2), ]
reg_temp_pos_df <- temp_pos_df[-(1:5), ]
reg_data_df_mask <- data_df$location %in% reg_spa_pos_df$location &
  data_df$time %in% reg_temp_pos_df$time
reg_data_df <- data_df[reg_data_df_mask, ]
new_data_df <- data_df[!reg_data_df_mask, ]
# Launch mcmc sampling on regression data
bktr_regressor <- BKTRRegressor$new(
  formula = nb_departure ~ 1 + mean_temp_c + area_park,
  data_df = reg_data_df,
  spatial_positions_df = reg_spa_pos_df,
  temporal_positions_df = reg_temp_pos_df,
  burn_in_iter = 5, sampling_iter = 10) # For example only (too few iterations)
bktr_regressor$mcmc_sampling()
# Predict response values for new data
bktr_regressor$predict(
  new_data_df = new_data_df,
  new_spatial_positions_df = new_spa_pos_df,
  new_temporal_positions_df = new_temp_pos_df)