x.validation: Run a conStruct cross-validation analysis
In conStruct: Models Spatially Continuous and Discrete Population Genetic Structure

x.validation

R Documentation

Description

`x.validation` runs a conStruct cross-validation analysis.

Usage

x.validation(
  train.prop = 0.9,
  n.reps,
  K,
  freqs = NULL,
  data.partitions = NULL,
  geoDist,
  coords,
  prefix,
  n.iter,
  make.figs = FALSE,
  save.files = FALSE,
  parallel = FALSE,
  n.nodes = NULL,
  ...
)

Arguments

`train.prop`	A numeric value between 0 and 1 that gives the proportion of the data to be used in the training partition of the analysis. Default is 0.9.
`n.reps`	An `integer` giving the number of cross-validation replicates to be run.
`K`	A numeric `vector` giving the numbers of layers to be tested in each cross-validation replicate. E.g., `K=1:7`.
`freqs`	A `matrix` of allele frequencies with one column per locus and one row per sample. Missing data should be indicated with `NA`.
`data.partitions`	A list with one element for each desired cross-validation replicate. This argument can be specified instead of the `freqs` argument if the user wants to provide their own data partitions for model training and testing. See the model comparison vignette for details on what this should look like.
`geoDist`	A `matrix` of pairwise geographic distances between samples. If `NULL`, the user can only run the nonspatial model.
`coords`	A `matrix` giving the longitude and latitude (or X and Y coordinates) of the samples.
`prefix`	A character `vector` giving the prefix to be attached to all output files.
`n.iter`	An `integer` giving the number of iterations each MCMC chain is run. Default is 1e3. If the number of iterations is greater than 500, the MCMC is thinned so that the number of retained iterations is 500 (before burn-in).
`make.figs`	A `logical` value indicating whether to automatically make figures during the course of the cross-validation analysis. Default is `FALSE`.
`save.files`	A `logical` value indicating whether to automatically save output and intermediate files once the analysis is complete. Default is `FALSE`.
`parallel`	A `logical` value indicating whether or not to run the different cross-validation replicates in parallel. Default is `FALSE`. For more details on how to set up runs in parallel, see the model comparison vignette.
`n.nodes`	Number of nodes to run parallel analyses on. Default is `NULL`. Ignored if `parallel` is `FALSE`. For more details on how to set up runs in parallel, see the model comparison vignette.
`...`	Further options to be passed to `rstan::sampling` (e.g., `adapt_delta`).
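
As a rough illustration of how the arguments above fit together, the following sketch builds simulated inputs of the described shapes and shows one possible call. The data are random placeholders, the use of Euclidean distance for `geoDist` assumes planar X/Y coordinates (great-circle distances would be more appropriate for longitude and latitude), and the `run.analysis` flag and the `"example"` prefix are hypothetical names introduced only for this sketch.

```r
set.seed(123)
n.samples <- 8
n.loci <- 100

# Simulated allele frequencies: one row per sample, one column per locus
freqs <- matrix(runif(n.samples * n.loci), nrow = n.samples, ncol = n.loci)
freqs[sample(length(freqs), 5)] <- NA  # missing data are indicated with NA

# Simulated X and Y coordinates, one row per sample
coords <- cbind(x = runif(n.samples), y = runif(n.samples))

# Pairwise distances between samples (assumption: planar coordinates,
# so Euclidean distance is used here)
geoDist <- as.matrix(dist(coords))

# The analysis itself can take a long time, so it is switched off by default.
# `run.analysis` is a hypothetical flag used only in this sketch.
run.analysis <- FALSE
if (run.analysis && requireNamespace("conStruct", quietly = TRUE)) {
  my.xvals <- conStruct::x.validation(
    train.prop = 0.9,
    n.reps = 3,
    K = 1:3,
    freqs = freqs,
    data.partitions = NULL,
    geoDist = geoDist,
    coords = coords,
    prefix = "example",
    n.iter = 1e3,
    make.figs = FALSE,
    save.files = FALSE,
    parallel = FALSE,
    n.nodes = NULL
  )
}
```

With real data, `freqs`, `coords`, and `geoDist` must list the samples in the same order, since rows are matched by position in this sketch.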

Details

This function initiates a cross-validation analysis that uses Monte Carlo cross-validation to determine the statistical support for models with different numbers of layers or with and without a spatial component.
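
The general idea of Monte Carlo cross-validation can be sketched in base R as repeated random splits of the data into training and testing partitions. This is an illustration only, not the package's internal code: it assumes that loci (columns of `freqs`) are the units being partitioned, it uses simulated frequencies, and the `splits` list is a hypothetical structure that does not follow the format expected by `data.partitions`.

```r
set.seed(1)
n.samples <- 10
n.loci <- 200
train.prop <- 0.9
n.reps <- 3

# Simulated allele frequencies: one row per sample, one column per locus
freqs <- matrix(runif(n.samples * n.loci), nrow = n.samples, ncol = n.loci)

# For each replicate, draw a fresh random subset of loci for training;
# the remaining loci form the testing partition
splits <- lapply(seq_len(n.reps), function(i) {
  train.idx <- sample(n.loci, size = round(train.prop * n.loci))
  list(
    train = freqs[, train.idx, drop = FALSE],
    test = freqs[, -train.idx, drop = FALSE]
  )
})

# Number of loci in each partition, one column per replicate
partition.sizes <- sapply(splits, function(s) {
  c(train = ncol(s$train), test = ncol(s$test))
})
```

Because each replicate draws its split independently, the same locus can appear in the testing partition of several replicates, which is what distinguishes this scheme from k-fold cross-validation.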

Value

This function returns (and also saves as a .Robj) a list containing the standardized results of the cross-validation analysis across replicates. For each replicate, the function returns a list with the following elements:

sp - the mean of the standardized log likelihoods of the "testing" data partition of that replicate for the spatial model, for each value of K specified in `K`.
nsp - the mean of the standardized log likelihoods of the "testing" data partition of that replicate for the nonspatial model, for each value of K specified in `K`.

In addition, this function saves two text files containing the standardized cross-validation results for the spatial and nonspatial models (prefix_sp_xval_results.txt and prefix_nsp_xval_results.txt, respectively). These values are written as matrices for user convenience; each column is a cross-validation replicate, and each row gives the result for a value of K.
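
One way to summarize a returned object of this shape is to collect the per-replicate values into matrices with one row per value of K and one column per replicate, matching the layout of the text files. The sketch below uses a mock list, `mock.xval`, filled with random placeholder numbers rather than real results, and assumes that `sp` and `nsp` each hold one value per K.

```r
set.seed(42)
K <- 1:3
n.reps <- 4

# Mock object mimicking the described structure: one list element per
# replicate, each with `sp` and `nsp`. Values are random placeholders.
mock.xval <- lapply(seq_len(n.reps), function(i) {
  list(sp = rnorm(length(K)), nsp = rnorm(length(K)))
})

# Rows are values of K, columns are cross-validation replicates
sp.results <- sapply(mock.xval, function(rep) unlist(rep$sp))
nsp.results <- sapply(mock.xval, function(rep) unlist(rep$nsp))
rownames(sp.results) <- rownames(nsp.results) <- paste0("K=", K)

# Average across replicates for each value of K
sp.means <- rowMeans(sp.results)
nsp.means <- rowMeans(nsp.results)
```

Comparing `sp.means` and `nsp.means` across values of K is one simple way to see which model and number of layers the cross-validation favors.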

conStruct documentation built on May 29, 2024, 4:23 a.m.