conStruct: Run a conStruct analysis.

conStruct R Documentation

Description

conStruct runs a conStruct analysis of genetic data.

Usage

conStruct(
  spatial = TRUE,
  K,
  freqs,
  geoDist = NULL,
  coords,
  prefix = "",
  n.chains = 1,
  n.iter = 1000,
  make.figs = TRUE,
  save.files = TRUE,
  ...
)

Arguments

spatial

A logical indicating whether to perform a spatial analysis. Default is TRUE.

K

An integer that indicates the number of layers to be included in the analysis.

freqs

A matrix of allele frequencies with one column per locus and one row per sample. Missing data should be indicated with NA.

geoDist

A full matrix of pairwise geographic distances between samples. If NULL, only the nonspatial model can be run.

coords

A matrix giving the longitude and latitude (or X and Y coordinates) of the samples.

prefix

A character vector giving the prefix to be attached to all output files.

n.chains

An integer indicating the number of MCMC chains to be run in the analysis. Default is 1.

n.iter

An integer giving the number of iterations each MCMC chain is run. Default is 1e3. If the number of iterations is greater than 500, the MCMC is thinned so that the number of retained iterations is 500 (before burn-in).

make.figs

A logical value indicating whether to automatically make figures once the analysis is complete. Default is TRUE.

save.files

A logical value indicating whether to automatically save output and intermediate files once the analysis is complete. Default is TRUE.

...

Further options to be passed to rstan::sampling (e.g., adapt_delta).
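As a rough illustration of the input shapes described above, the following base R sketch builds toy freqs, coords, and geoDist objects and checks that their dimensions agree. All values are made up, and the use of Euclidean distance via dist() is an assumption that suits planar X and Y coordinates only; for longitude and latitude, a great-circle distance would likely be more appropriate.

```r
# Toy inputs: 4 samples, 3 loci (values are made up for illustration).
set.seed(1)
n.samples <- 4
n.loci <- 3

# Allele frequencies: one row per sample, one column per locus.
freqs <- matrix(runif(n.samples * n.loci), nrow = n.samples, ncol = n.loci)
freqs[2, 3] <- NA  # missing data are coded as NA

# Sample coordinates: one row per sample (here, planar X and Y).
coords <- cbind(x = c(0, 1, 0, 1), y = c(0, 0, 1, 1))

# Full (symmetric) matrix of pairwise distances between samples.
# Assumption: Euclidean distance, appropriate for planar coordinates only.
geoDist <- as.matrix(dist(coords))

# Basic consistency checks before handing the objects to conStruct().
stopifnot(
  nrow(freqs) == nrow(coords),
  nrow(geoDist) == nrow(freqs),
  ncol(geoDist) == nrow(freqs),
  isSymmetric(geoDist)
)
```

These checks only mirror the requirements stated in the argument descriptions; they are not the package's own validation code.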

Details

This function initiates an analysis that uses geographic and genetic relationships between samples to estimate sample membership (admixture proportions) across a user-specified number of layers.

This function acts as a wrapper around a Stan model block determined by the user-specified model (e.g., a spatial model with 3 layers, or a nonspatial model with 5 layers). User-specified data are checked for appropriate format and consistent dimensions, then formatted into a data.block, which is passed to the Stan model block. In addition to returning the conStruct.results output described in the Value section, a conStruct call saves several objects if save.files=TRUE: the data.block, which contains all data passed to the Stan model block; model.fit, which holds the unprocessed results of the Stan run in stanfit format; and conStruct.results itself, which is saved during the function call as well as being returned. If make.figs=TRUE, running conStruct also generates a set of output figures, which are documented under the function make.all.the.plots in this package.

Value

This function returns a list with one entry for each chain run (specified with n.chains). The entry for the Xth chain is named "chain_X". The components of each chain's entry are detailed below:

  • posterior gives parameter estimates over the posterior distribution of the MCMC.

    • n.iter number of MCMC iterations retained for analysis (half of the n.iter argument specified in the function call).

    • lpd vector of log posterior density over the retained MCMC iterations.

    • nuggets matrix of estimated nugget parameters with one row per MCMC iteration and one column per sample.

    • par.cov array of estimated parametric covariance matrices, for which the first dimension is the number of MCMC iterations.

    • gamma vector of estimated gamma parameter.

    • layer.params list summarizing estimates of layer-specific parameters. There is one entry for each layer specified, and the entry for the kth layer is named "Layer_k".

      • alpha0 vector of estimated alpha0 parameter in the kth layer.

      • alphaD vector of estimated alphaD parameter in the kth layer.

      • alpha2 vector of estimated alpha2 parameter in the kth layer.

      • mu vector of estimated mu parameter in the kth layer.

      • layer.cov vector of estimated layer-specific covariance parameter in the kth layer.

    • admix.proportions array of estimated admixture proportions. The first dimension is the number of MCMC iterations, the second is the number of samples, and the third is the number of layers.

  • MAP gives point estimates of the parameters listed in the posterior list described above. Values are indexed at the MCMC iteration with the greatest posterior probability.

    • index.iter the iteration of the MCMC with the highest posterior probability, which is used to index all parameters included in the MAP list.

    • lpd the greatest value of the posterior probability.

    • nuggets point estimate of nugget parameters.

    • par.cov point estimate of parametric covariance.

    • gamma point estimate of gamma parameter.

    • layer.params point estimates of all layer-specific parameters.

    • admix.proportions point estimates of admixture proportions.
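To make the nesting of the returned list easier to picture, here is a minimal sketch that builds a toy stand-in with the same element names (chain_1, posterior, MAP, admix.proportions, lpd, index.iter) and then extracts values from it. The toy object and its numbers are hypothetical; a real result contains more components, and the samples-by-layers shape of the MAP admixture matrix is an assumption based on the description above.

```r
# Toy stand-in for a conStruct result: 1 chain, 5 retained iterations,
# 3 samples, 2 layers. Values are made up; only the structure matters.
n.iter <- 5
n.samples <- 3
K <- 2
set.seed(42)
raw <- array(rexp(n.iter * n.samples * K), dim = c(n.iter, n.samples, K))

# Normalise so each sample's proportions sum to 1 within an iteration.
admix <- sweep(raw, c(1, 2), apply(raw, c(1, 2), sum), "/")

lpd <- c(-10.2, -9.8, -9.1, -9.5, -9.9)
best <- which.max(lpd)  # iteration with the highest posterior density

toy.results <- list(
  chain_1 = list(
    posterior = list(n.iter = n.iter, lpd = lpd, admix.proportions = admix),
    MAP = list(index.iter = best,
               lpd = lpd[best],
               admix.proportions = admix[best, , ])
  )
)

# Accessing results for the first chain.
chain <- toy.results$chain_1
map.admix <- chain$MAP$admix.proportions  # samples x layers (assumed shape)

# Posterior mean admixture proportions, averaged over iterations.
post.mean.admix <- apply(chain$posterior$admix.proportions, c(2, 3), mean)
```

With a real run, the same accessors would be applied to the object returned by conStruct (for example, my.run$chain_1$MAP$admix.proportions).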

Examples

# load the package and the example dataset
library(conStruct)
data(conStruct.data)

# run example spatial analysis with K=1
#
# for this example, make.figs and save.files
# are set to FALSE, but most users will want them
# set to TRUE
my.run <- conStruct(spatial = TRUE,
                    K = 1,
                    freqs = conStruct.data$allele.frequencies,
                    geoDist = conStruct.data$geoDist,
                    coords = conStruct.data$coords,
                    prefix = "test",
                    n.chains = 1,
                    n.iter = 1e3,
                    make.figs = FALSE,
                    save.files = FALSE)
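A variant of the example above, offered as a sketch rather than a recommended configuration: a nonspatial run with two layers and two chains, with a sampler option passed through ... to rstan::sampling. It assumes that the conStruct package is installed and that adapt_delta is supplied via the control list that rstan::sampling accepts. The prefix "nsp_K2" and the value 0.9 are arbitrary choices for illustration.

```r
# Assumes the conStruct package (and its rstan dependency) is installed.
library(conStruct)
data(conStruct.data)

# Nonspatial model: geoDist may be left as NULL.
my.nsp.run <- conStruct(spatial = FALSE,
                        K = 2,
                        freqs = conStruct.data$allele.frequencies,
                        geoDist = NULL,
                        coords = conStruct.data$coords,
                        prefix = "nsp_K2",
                        n.chains = 2,
                        n.iter = 1e3,
                        make.figs = FALSE,
                        save.files = FALSE,
                        # passed on to rstan::sampling via ...
                        control = list(adapt_delta = 0.9))

# One list entry per chain, named as described in the Value section.
names(my.nsp.run)
```

Running more than one chain gives a simple way to compare results across independent MCMC runs.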


conStruct documentation built on March 31, 2023, 10:13 p.m.