LG_boot_approx_scribe: Local Gaussian Approximations for bootstrap replicates,...

View source: R/LG_boot_approx_scribe.R

Description

This function takes care of the bookkeeping when we want to investigate a time series by means of local Gaussian Approximations used on bootstrapped replicates of the time series.

Usage

LG_boot_approx_scribe(main_dir, data_dir, lag_max = NULL,
  LG_points = NULL, content_details = NULL, LG_type = NULL,
  .bws_mixture = NULL, bw_points = NULL, .bws_fixed = NULL,
  .bws_fixed_only = NULL, nb = NULL, boot_type = NULL,
  block_length = NULL, boot_seed = NULL, threshold = 500)

Arguments

main_dir

The path to the main directory that contains the file-hierarchy created when using the local Gaussian approach for the investigation of time series.

data_dir

A specification of the directory to be used when loading and saving data.

lag_max

The number of lags to include in the analysis.

LG_points

An array that specifies the points at which it is desired to compute the local Gaussian estimates. The default value NULL implies that the values used in the computations upon the original data are recycled. Values can alternatively be computed with the help of the function LG_select_points, but these must then be a subset of the original values.

content_details

A value from c("rho_only", "rho_log.fun", "rho_all"), which decides the amount of detail stored from the local Gaussian approximations.

LG_type

One of c("par_five", "par_one"), which decides whether the Local Gaussian Autocorrelations should be based on the approach using five parameters or on the simplified approach using only one parameter. The default choice is "par_five".

.bws_mixture

An argument that specifies how the global bandwidths and those obtained by the nearest-neighbour strategy should be combined. The three available options are c("mixture", "local", "global"). The alternatives local and global select only the nearest-neighbour bandwidths or only the global bandwidths, respectively. Neither seems to work well on all parts of the lagged pairs of interest: the nearest-neighbour values might be too "small" in the center of the distribution, whereas the global bandwidths seem to fail in the periphery of the distribution. The alternative mixture attempts to resolve this by selecting, for each grid point, the largest of the two alternative bandwidths. Note that the value of .bws_mixture decides how much information is computed, i.e. the alternative local will turn off the computation of global bandwidths. However, the nearest-neighbour bandwidths will also be computed when the alternative global is used, since this does not take long and it is that function that creates the array we need as a mould for the result. If the user does not make a selection, then all three alternatives will be computed.
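The mixture rule described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the vectors bw_local and bw_global and their values are hypothetical, and the code is not taken from the package.

```r
# Hypothetical bandwidths for five grid points (illustrative values only).
bw_local  <- c(0.20, 0.35, 0.60, 0.90, 1.40)  # nearest-neighbour based
bw_global <- rep(0.75, 5)                     # one global value per grid point

# "mixture": for each grid point, keep the larger of the two candidates.
bw_mixture <- pmax(bw_local, bw_global)
print(bw_mixture)
```

With this rule the global value acts as a floor in the center of the distribution, while the nearest-neighbour value takes over wherever it is larger.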

bw_points

A vector, default c(25, 35), that specifies the percentage of the observations that we want inside the "bandwidth-square". If .bws_mixture is selected to be global, then this argument will be ignored, and no nearest neighbours will be computed.

.bws_fixed

A vector of non-negative real values, that can be used to specify fixed values for the bandwidths (which might be of interest to do in a preliminary analysis). The default value NULL will prevent the computation of Local Gaussian Estimates based on fixed bandwidths.

.bws_fixed_only

A logical value, default FALSE, that can be used to drop the rather time-consuming data-driven estimation of bandwidths in favour of the simplified approach where fixed bandwidths are used instead. Note that .bws_fixed must be specified when .bws_fixed_only is set to TRUE.
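The dependency between .bws_fixed and .bws_fixed_only can be expressed as a small sanity check. The helper check_bws_fixed below is hypothetical and only mirrors the requirements stated in this documentation; the package may validate its arguments differently.

```r
# Hypothetical helper: mirrors the documented constraints, not the package code.
check_bws_fixed <- function(.bws_fixed = NULL, .bws_fixed_only = FALSE) {
  # Fixed-only mode makes no sense without fixed bandwidths.
  if (isTRUE(.bws_fixed_only) && is.null(.bws_fixed))
    stop("'.bws_fixed' must be specified when '.bws_fixed_only' is TRUE")
  # When given, the bandwidths must be non-negative real values.
  if (!is.null(.bws_fixed) &&
      (!is.numeric(.bws_fixed) || anyNA(.bws_fixed) || any(.bws_fixed < 0)))
    stop("'.bws_fixed' must contain non-negative real values")
  invisible(TRUE)
}

check_bws_fixed(.bws_fixed = c(0.5, 1), .bws_fixed_only = TRUE)  # passes silently
```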

nb

An integer that specifies how many bootstrap-replicates we want to use in our analysis. Default value 5 (at least in the development phase).

boot_type

This argument selects the kind of bootstrap algorithm to be used. The default value is "block", and no other algorithms have been implemented yet. (Reminder: It would be nice to include the 'network-duality' approach in this context, but then I fear I might need to implement that routine from scratch.)

block_length

The length of the blocks to be used when boot_type="block" is used. Default value 20 in the development phase, but I suppose in general it should be some formula based on the time series under investigation.
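To make the role of block_length concrete, here is a minimal sketch of a moving block bootstrap in base R. It is an assumption about what a "block" bootstrap typically looks like; the function block_bootstrap is hypothetical and is not the algorithm used by LG_boot_approx.

```r
# Hypothetical moving block bootstrap; not the package's implementation.
block_bootstrap <- function(x, block_length) {
  n <- length(x)
  stopifnot(block_length >= 1, block_length <= n)
  n_blocks <- ceiling(n / block_length)
  # Random start positions for (possibly overlapping) blocks.
  starts <- sample.int(n - block_length + 1, size = n_blocks, replace = TRUE)
  # Concatenate the blocks and trim the result to the original length.
  idx <- unlist(lapply(starts, function(s) s:(s + block_length - 1)))
  x[idx[seq_len(n)]]
}

set.seed(1)
x <- cumsum(rnorm(100))  # toy series with serial dependence
replicate_1 <- block_bootstrap(x, block_length = 20)
```

Resampling whole blocks rather than single observations preserves the dependence structure within each block, which is why the block length should in general reflect the series under investigation.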

boot_seed

Use this to enable reproducible results. The default value NULL will trigger the selection of a random seed, which is then recorded in case a reproduction of the result is desired later on.
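The seed handling described above can be sketched as follows. The helper resolve_boot_seed is hypothetical; how the package actually draws and stores the seed is not shown in this documentation.

```r
# Hypothetical helper: draw a seed when none is given, then return it
# so that it can be recorded alongside the results.
resolve_boot_seed <- function(boot_seed = NULL) {
  if (is.null(boot_seed))
    boot_seed <- sample.int(.Machine$integer.max, size = 1)
  set.seed(boot_seed)
  boot_seed
}

used_seed <- resolve_boot_seed()                 # random seed, recorded
first     <- rnorm(3)
same_seed <- resolve_boot_seed(used_seed)        # replay with the recorded seed
second    <- rnorm(3)
identical(first, second)
```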

threshold

An integer, default value 500 (measured in MB), that tells the program when a computation should be divided into smaller chunks. This reduces the chance of memory-related problems, but as the present incarnation of LG_splitter is rather crude, such problems might still occur for long time series where a huge number of lags are included.
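The idea behind threshold can be illustrated with a rough size estimate that splits the lags into chunks. Everything here is an assumption made for illustration: the function split_lags, its arguments n_points and nb, and the size model of 8 bytes per stored number are hypothetical and do not describe how LG_splitter works.

```r
# Hypothetical chunking by estimated memory use; not the logic of LG_splitter.
split_lags <- function(lag_max, n_points, nb, threshold = 500) {
  # Assumed size model: one double (8 bytes) per point, replicate and lag.
  mb_per_lag <- 8 * n_points * nb / 1024^2
  # At least one lag per chunk, even if a single lag exceeds the threshold.
  lags_per_chunk <- max(1, floor(threshold / mb_per_lag))
  lags <- seq_len(lag_max)
  split(lags, ceiling(lags / lags_per_chunk))
}

chunks <- split_lags(lag_max = 200, n_points = 1e5, nb = 100, threshold = 500)
length(chunks)  # number of chunks needed under this size model
```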

Details

This function records its arguments and compares them to a previously stored information-object for the time series under investigation, in order to avoid redoing previously performed computations. For new computations, relevant data will be extracted from data_dir and analysed in order to see if memory issues require that the computation be performed in smaller chunks. The function then calls LG_boot_approx, which does the computations, and the result is added to the file-structure. Finally, the information-object is updated and a two-component value is returned to the work-flow.
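The bookkeeping step can be sketched as a lookup of the recorded arguments in a stored information-object. The structure of info, the helper find_previous and the directory name "Boot_1" are hypothetical; the package's own information-object is more elaborate and is not described here.

```r
# Hypothetical lookup: has a computation with these arguments been done before?
find_previous <- function(info, new_args) {
  for (entry in info) {
    if (identical(entry$args, new_args))
      return(list(done_before = TRUE, data_dir = entry$data_dir))
  }
  NULL  # nothing stored for these arguments: a new computation is needed
}

# Toy information-object with one earlier bootstrap computation.
info <- list(
  list(args = list(nb = 5, block_length = 20, boot_seed = 1234),
       data_dir = "Boot_1")
)

hit  <- find_previous(info, list(nb = 5, block_length = 20, boot_seed = 1234))
miss <- find_previous(info, list(nb = 10, block_length = 20, boot_seed = 1234))
```

A hit has the same shape as the two-component value documented below, whereas a miss signals that the computation still has to be performed and stored.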

Note that no default values are given for any of the arguments. This choice is made in order to dissuade users from calling this (often quite time-consuming) function directly from the work-space. The intention is that the bootstrap-wrapper should call this function, with arguments inherited (restrictions are allowed) from those used in a previous analysis of a time series. This is done since the main motivation for working with bootstrapped replicates of our original time series is to obtain bootstrap-based confidence intervals for the Local Gaussian Spectral Densities.

Value

This function returns a two-component list to the work-flow. The first component is the logical value done_before, which reveals whether or not the result already existed (in the specified file-structure), whereas the second component data_dir gives the location of the saved data.


LAJordanger/localgaussSpec documentation built on Nov. 27, 2018, 11:34 p.m.