poolne_estim_input: Generate input files for 'poolne_estim'

View source: R/poolne_estim_input.R

poolne_estim_input    R Documentation

Generate input files for poolne_estim

Description

This function prepares biallelic SNP data for analysis by the programme poolne_estim.

Usage

poolne_estim_input(
  dat,
  pool.info,
  runID,
  mcmc.sampling = c(1000, 25, 5000, 20, 50, sample(as.integer(1000:9999), 1))
)

Arguments

dat

Data table: The biallelic SNP data. Requires all of the following columns:

  1. $POOL = The population pool ID.

  2. $SAMPLE = The sample replicate ID; e.g. 1, 2, or 3, if three replicates were made.

  3. $CHROM = The chromosome ID.

  4. $LOCUS = The locus ID.

  5. $RO = The number of Ref allele reads at the locus.

  6. $AO = The number of Alt allele reads at the locus.
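
As a rough illustration of the expected layout, the sketch below builds a small long-format table with the six required columns and checks that none are missing. The object name toy_dat, the locus names, and the read counts are all made up for illustration; it assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical toy data: 2 pools x 2 replicates x 3 loci = 12 rows
toy_dat <- CJ(
  POOL = c("Pop1", "Pop2"),
  SAMPLE = 1:2,
  LOCUS = c("Contig1_10", "Contig1_55", "Contig2_7")
)

# Derive a chromosome ID from the (made-up) locus naming scheme
toy_dat[, CHROM := sub("_.*", "", LOCUS)]

# Made-up Ref and Alt read counts, one row per pool, replicate, and locus
toy_dat[, RO := 20L + seq_len(.N)]
toy_dat[, AO := 30L - seq_len(.N)]

# Confirm every required column is present before building inputs
required <- c("POOL", "SAMPLE", "CHROM", "LOCUS", "RO", "AO")
missing_cols <- setdiff(required, names(toy_dat))
stopifnot(length(missing_cols) == 0)
```

A check like this is cheap to run on real data before calling the function, since a misspelled column name is an easy mistake to make.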

pool.info

Data table: The population pool metadata. Requires all of the following columns:

  1. $POOL = The population pool ID.

  2. $INDS = The number of diploid individuals in each pooled library.
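
A minimal sketch of a matching metadata table follows. The names toy_info and dat_pools and the pool sizes are hypothetical; the checks simply reflect the idea that each pool in the read data should have one metadata row, which is an assumption about sensible input rather than documented behaviour.

```r
library(data.table)

# Hypothetical pool metadata: one row per pool
toy_info <- data.table(
  POOL = c("Pop1", "Pop2"),
  INDS = c(30L, 25L)   # diploid individuals pooled in each library
)

# Hypothetical pool IDs as they might appear in dat$POOL
dat_pools <- c("Pop1", "Pop2")

# Each pool in the reads should have exactly one metadata row,
# and pool sizes should be positive
stopifnot(
  all(dat_pools %in% toy_info$POOL),
  !anyDuplicated(toy_info$POOL),
  all(toy_info$INDS > 0)
)
```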

runID

Character: A run or simulation identifier. It is included in the output file names (see Value).

mcmc.sampling

A vector of MCMC sampling parameters for poolne_estim. Can be customised, but can also be left alone (these are the defaults in poolne_estim). Must have 6 values:

  1. The number of values to sample from the posterior distribution (e.g. 1000).

  2. The thinning rate (e.g. 25).

  3. The burn-in period length (e.g. 5000).

  4. The maximal number of pilot runs (e.g. 20) to adjust the parameters of the proposal distributions.

  5. The pilot run length (e.g. 50, as in the default shown under Usage).

  6. A positive integer for random seed generation.
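
The six values can be assembled as in the sketch below. The element names (n_samples, thin, and so on) and the chosen numbers are purely illustrative; the names are dropped at the end on the assumption that the values are read by position. Because the default draws a random seed on every call, fixing the sixth value is one way to make a run easier to repeat.

```r
# Custom MCMC settings, in the order listed above (names are hypothetical labels)
my_mcmc <- c(
  n_samples  = 2000,   # values sampled from the posterior
  thin       = 25,     # thinning rate
  burn_in    = 10000,  # burn-in period length
  max_pilots = 20,     # maximal number of pilot runs
  pilot_len  = 100,    # pilot run length
  seed       = 4242    # fixed positive integer seed
)

# Basic sanity checks: six positive whole numbers
stopifnot(
  length(my_mcmc) == 6,
  all(my_mcmc > 0),
  all(my_mcmc == round(my_mcmc))
)

# Assumption: values are used by position, so drop the labels
my_mcmc <- unname(my_mcmc)

# The vector could then be passed on, for example:
# poolne_estim_input(dat = X, pool.info = data_PoolInfo,
#                    runID = 'run1', mcmc.sampling = my_mcmc)
```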

Value

Four files are written to the working directory:

  1. [runID]_[poolID]_Ref.txt = Counts of Ref alleles; loci in rows, sample replicates in columns.

  2. [runID]_[poolID]_Dep.txt = Depth of reads; loci in rows, sample replicates in columns.

  3. [runID]_[poolID]_Loci.txt = A list of loci in the order they appear in the count TXT files.

  4. [runID]_[poolID]_Input.txt = The params input file for poolne_estim.
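
To make the layout of the count files concrete, the sketch below reshapes a few made-up reads into loci-by-replicate tables and assembles file names following the pattern above. It is not the function's own code: it assumes that depth is the sum of Ref and Alt reads, and the objects d, ref_wide, dep_wide, and files are hypothetical. It requires the data.table package.

```r
library(data.table)

# Hypothetical reads for one pool: 2 loci x 2 replicates
d <- data.table(
  POOL   = "Pop1",
  SAMPLE = c(1L, 2L, 1L, 2L),
  LOCUS  = c("L1", "L1", "L2", "L2"),
  RO     = c(12L, 15L, 30L, 28L),
  AO     = c(8L, 5L, 10L, 12L)
)

# Assumption: read depth is the sum of Ref and Alt reads
d[, DP := RO + AO]

# Loci in rows, sample replicates in columns
ref_wide <- dcast(d, LOCUS ~ SAMPLE, value.var = "RO")
dep_wide <- dcast(d, LOCUS ~ SAMPLE, value.var = "DP")

# File names following the [runID]_[poolID]_*.txt pattern
runID <- "genomalicious"
files <- paste0(
  runID, "_", unique(d$POOL), "_",
  c("Ref", "Dep", "Loci", "Input"), ".txt"
)
print(ref_wide)
print(files)
```

Comparing tables like these against the written Ref and Dep files is a quick way to confirm that loci and replicates ended up in the expected order.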

Examples

# Load in the pool metadata and reads
data(data_PoolInfo)
data(data_PoolReads)

# Have a look at the data: the samples are populations
# ('Pop') and replicate library preps ('Rep').
data_PoolReads$SAMPLE

# You need to make sure there is a column that contains
# the pool ID. Split the SAMPLE column and return the first value:
X <- data_PoolReads
X$POOL <- unlist(lapply(strsplit(X$SAMPLE, '_'), function(s){ return(s[1]) }))

# Check
X

# Now make inputs
poolne_estim_input(dat=X, pool.info=data_PoolInfo, runID='genomalicious')


j-a-thia/genomalicious documentation built on Oct. 19, 2024, 7:51 p.m.