runFSC_step_agg3: Run fastsimcoal

View source: R/runFSC_step_agg3.R

runFSC_step_agg3R Documentation

Run fastsimcoal

Description

Takes a population history from the forward demographic simulation and parameterizes a coalescent simulation in fastsimcoal (Excoffier et al. 2013)

Usage

runFSC_step_agg3(
  ph = ph,
  l = landscape,
  sample_n = NULL,
  preLGMparms = NULL,
  label = "test",
  delete_files = TRUE,
  num_cores = 1,
  exec = "fsc25",
  loc_parms = NULL,
  found_Ne = NULL,
  gmap = NULL,
  MAF = NULL,
  maxloc = 2e+05
)

Arguments

ph

A pophist object, output by the forward demographic simulation function: getpophist2.cells. Includes slots: pophist, Nvecs, tmat, struct, hab_suit, coalhist, popslst, old_hab_suit, old_tmat, gmap. Cells in typical landscapes (>400 cells) should be aggregated after the forward-time simulation using the functions make.gmap and pophist.aggregate to improve computational efficiency.

l

An object that defines the simulation landscape from the forward demographic simulation, includes slots: details, occupied, empty, sampled, hab_suit, sumrast, samplocsrast, samplocs, sampdf, NAmask.

sample_n

The number of sampled (diploid) individuals per population. Assumed to be equal across all sampled populations.

preLGMparms

A one row data frame of parameters related to the size and history of populations in glacial refugia: preLGM_t (time in generations at which refugial populations are assumed to diverge, e.g., previous interglacial), preLGM_Ne (effective population size in the ancestral population), ref_Ne (effective size of populations in refugial cells).

label

A text string specifying a filename prefix for output from fastsimcoal. Note that hyphens (-) should be avoided; they are converted to periods (.) in files output by fastsimcoal. In order to delete files with strataG::fscCleanup, avoid hyphens in file names.

delete_files

A logical (TRUE/FALSE) indicating whether temporary simulation files and directories from fastsimcoal should be deleted after genetic data are read into R. If TRUE, strataG::fscCleanup is used to remove files output during the coalescent simulation.

num_cores

Number of processors to use for fastsimcoal simulations.

exec

The filename for the fastsimcoal executable.

loc_parms

A one row data frame of parameters providing details on the simulated genetic markers. Contains the following named elements: marker (the type of marker to simulate, e.g., "snp"), nloci (the final number of loci included in the output dataset), seq_length (the assumed sequence length of simulated loci, if marker = "snp"), mu (the mutation rate for the simulated marker).

found_Ne

A parameter describing the effective population size associated with newly colonized populations in the coalescent simulation.

gmap

A map specifying a scheme for aggregating cells from the finer grained forward demographic simulation for more efficient coalescent simulation.

MAF

A numeric value specifying a minor allele frequency filter. SNP loci with minor alleles below this frequency are excluded from output.

maxloc

The maximum number of loci to attempt to simulate in fastsimcoal. If too few variable sites are generated, additional markers are simulated up to this value.

Details

Provides a wrapper for several helpful functions from the strataG R package. Input parameters include a pophist object (from getpophist2.cells), a landscape object, and parameter values associated with refugial populations and genetic markers. Population size changes are limited to step-changes from founding population size (found_Ne) to the population size at the end of the forward simulation (from elements of ph$Nvecs). To improve computational efficiency, coalescent simulations are conducted for aggregated groups of cells, with an aggregation scheme defined by the function make.gmap. SNP data output by this function are subsampled to retain only one variable position from each simulated DNA segment (locus). Note that the fastsimcoal executable must be installed and in a directory in the user's $PATH for coalescent simulations.

Value

Returns a data frame of individual SNP genotypes from fastsimcoal. The first column of this data frame provides an individual ID, the second specifies the population or deme associated with the individual, and columns that follow provide diploid genotypes for simulated individuals (in two-column format).

See Also

fscTutorial, fscWrite, fscRun, fscReadArp, getpophist2.cells, pophist.aggregate, make.gmap, holoStats, http://cmpg.unibe.ch/software/fastsimcoal26/man/fastsimcoal26.pdf

Examples

library(holoSimCell)
parms <- drawParms(control = system.file("extdata/ashpaper","Ash_priors.csv",package="holoSimCell"))
load(file=paste0(system.file(package="holoSimCell"),"/extdata/landscapes/",pollenPulls[[1]]$file))
refpops <- pollenPulls[[1]]$refs
avgCellsz <- mean(c(res(landscape$sumrast)))

ph = getpophist2.cells(h = landscape$details$ncells, xdim = landscape$details$x.dim, ydim = landscape$details$y.dim,
                       landscape=landscape,
                       refs=refpops,   
                       refsz=parms$ref_Ne,
                       lambda=parms$lambda,
                       mix=parms$mix,  
                       shortscale=parms$shortscale*avgCellsz,  
                       shortshape=parms$shortshape, 
                       longmean=parms$longmean*avgCellsz,  
                       ysz=res(landscape$sumrast)[2], 
                       xsz=res(landscape$sumrast)[1], 
                       K = parms$Ne) 

gmap=make.gmap(ph$pophist,
               xnum=2, #number of cells to aggregate in x-direction
               ynum=2) #number of aggregate in the y-direction

ph2 <- pophist.aggregate(ph,gmap=gmap)

loc_parms <- data.frame(marker = "snp",
                        nloci = parms$nloci,           
                        seq_length = parms$seq_length,
                        mu = parms$mu)
  
preLGMparms <- data.frame(preLGM_t = parms$preLGM_t/parms$G,   
                          preLGM_Ne = parms$preLGM_Ne,
                         ref_Ne = parms$ref_Ne)

out <- runFSC_step_agg3(ph = ph2,
                        l = landscape,
                        sample_n = 14,
                        preLGMparms = preLGMparms,
                        label = "test",
                        delete_files = TRUE,
                        num_cores = 1,
                        exec = "fsc26",
                        loc_parms = loc_parms,
                        found_Ne = parms$found_Ne,
                        gmap = gmap,
                        MAF = 0.01,
                        maxloc = 50000)


stranda/holoSimCell documentation built on Aug. 4, 2023, 1:12 p.m.