calcRegMod: Calculate Regression Model under User Input Demography...


View source: R/calcRegMod.R

Description

This function computes the regression model for user-defined demographic scenarios. The user can also set the sample sizes, sequence lengths, and recombination rates of the simulated populations.

Usage

calcRegMod(n = c(10,16,20), len = c(500,1000,2000,3000,5000), thth = 0.01, nsim = 100,
           fr = c(), pathToScrm, scenario, pathToMs2dna, status = T, pathLDhat, pathPhi)

Arguments

n

A numeric vector of sample sizes for the simulated populations. Defaults to 10, 16, and 20, but any numeric vector can be supplied.

len

A numeric vector containing the lengths of the simulated sequences. Defaults to 500, 1000, 2000, 3000, and 5000 (that is, 0.5, 1, 2, 3, and 5 kb), but any integer values can be supplied.

thth

A numeric value for the mutation rate theta under which the populations are simulated. Defaults to 0.01, but any numeric value can be supplied.

fr

A numeric vector containing the recombination rates under which to simulate. By default it is an empty vector, in which case the recombination rates are drawn as uniform random variables from 5 intervals, with nsim values per interval.

nsim

An integer value for the number of replications (populations) simulated per setup. Setups result from all combinations of sample sizes and sequence lengths. Defaults to 100, but any integer value can be supplied.

pathToScrm

A character string containing the path to scrm. Both this path and an installation of scrm are necessary for the function to run.

scenario

A character string containing the demography model (scenario) under which the populations should be simulated. See the scrm documentation for details on how to define varying population sizes.

pathToMs2dna

A character string containing the path to ms2dna. Both this path and an installation of ms2dna are necessary for the function to run.

status

An optional logical value. Defaults to TRUE, in which case the current processing status (the number of simulated populations) is printed.

pathLDhat

A character string containing the path to LDhat. Both this path and an installation of LDhat are necessary for the computation of the package.

pathPhi

A character string containing the path to PhiPack. Both this path and an installation of PhiPack are necessary for the computation of the package.
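
The description of nsim states that setups result from all combinations of sample sizes and sequence lengths. As a minimal sketch of that idea, the grid for the default n and len can be built with base R's expand.grid. The name setups is illustrative only and is not taken from the package source; how calcRegMod iterates over the combinations internally is an assumption.

```r
# Default values taken from the Usage section
n   <- c(10, 16, 20)                     # sample sizes
len <- c(500, 1000, 2000, 3000, 5000)    # sequence lengths

# Hypothetical illustration: one row per (sample size, sequence length) setup
setups <- expand.grid(n = n, len = len)

nrow(setups)      # number of setups for the default arguments
head(setups, 3)   # first few combinations
```

With the defaults this gives 3 x 5 = 15 setups, so increasing nsim or extending n or len scales the simulation effort accordingly.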

Value

regMod

The generalized additive regression model fitted to the Box-Cox transformed true recombination rates, using summary statistics computed from populations simulated under a user-defined demography (scenario).

data.all

A data frame with one column per summary statistic and one row per simulated population sample.
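
The regression model is described as being fitted to Box-Cox transformed recombination rates. The following is a minimal sketch of the standard Box-Cox transform and its inverse in base R. The helper names box_cox and inv_box_cox, the example rates, and the value of lambda are all hypothetical; the transformation parameter actually used by LDJump is not stated here.

```r
# Box-Cox transform: (y^lambda - 1) / lambda, with log(y) as the lambda = 0 case.
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))                  # defined for positive values only
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Inverse transform, mapping values back to the original scale.
inv_box_cox <- function(z, lambda) {
  if (lambda == 0) exp(z) else (lambda * z + 1)^(1 / lambda)
}

rho    <- c(0.001, 0.01, 0.05, 0.1)      # made-up recombination rates
lambda <- 0.5                            # hypothetical value, not LDJump's

z <- box_cox(rho, lambda)
all.equal(inv_box_cox(z, lambda), rho)   # round trip recovers the input
```

Predictions from a model fitted on the transformed scale would need the inverse transform to be read as recombination rates.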

Note

This function only works on Unix and requires PhiPack to be installed. If LDhat (Auton and McVean, 2007) is also installed, LDJump computes estimates much faster. Please check all paths to PhiPack and, if used, LDhat, as well as the sequence files. In addition, the software packages scrm and ms2dna must be installed to simulate populations under a user-defined demography (scenario).
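
Since the function depends on several external tools, it can help to verify the platform and the paths before starting a long simulation. The sketch below uses only base R. The helper check_tool_paths is hypothetical and not part of LDJump, and whether each path argument should point to a directory or to an executable is an assumption to be checked against the tool installations.

```r
# Hypothetical helper (not part of LDJump): stop early if a path does not exist.
check_tool_paths <- function(paths) {
  not_found <- paths[!file.exists(paths)]
  if (length(not_found) > 0) {
    stop("Path(s) not found: ", paste(not_found, collapse = ", "))
  }
  invisible(TRUE)
}

# The Note states that the function only works on Unix.
is_unix <- .Platform$OS.type == "unix"

# Demonstration with a directory that is known to exist in any R session.
check_tool_paths(tempdir())
```

Before calling calcRegMod, the same helper could be applied to the values intended for pathToScrm, pathToMs2dna, pathLDhat, and pathPhi.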

Author(s)

Philipp Hermann philipp.hermann@jku.at, Andreas Futschik

References

Auton, A. and McVean, G. (2007). Recombination rate estimation in the presence of hotspots. Genome Research, 17(8), 1219-1227.

Bruen, T. C., Philippe, H., and Bryant, D. (2006). A simple and robust statistical test for detecting the presence of recombination. Genetics, 172(4):2665-2681.

Frick, K., Munk, A., and Sieling, H. (2014). Multiscale change-point inference. Journal of the Royal Statistical Society: Series B, 76(3), 495-580.

Futschik, A., Hotz, T., Munk, A., and Sieling, H. (2014). Multiscale DNA partitioning: Statistical evidence for segments. Bioinformatics, 30(16), 2255-2262.

Hermann, P., Heissl, A., Tiemann-Boege, I., and Futschik, A. (2019). LDJump: Estimating variable recombination rates from population genetic data. Molecular Ecology Resources. doi:10.1111/1755-0998.12994.

Jombart T. and Ahmed I. (2011) adegenet 1.3-1: new tools for the analysis of genome-wide SNP data. Bioinformatics. doi:10.1093/bioinformatics/btr521

Knaus BJ and Grünwald NJ (2017). VCFR: a package to manipulate and visualize variant call format data in R. Molecular Ecology Resources, 17(1), pp. 44-53. ISSN 757, doi:10.1111/1755-0998.12549.

McVean, G. A. T., Myers, S. R., Hunt, S., Deloukas, P., Bentley, D. R., and Donnelly, P. (2004). The fine-scale structure of recombination rate variation in the human genome. Science, 304(5670), 581-584.

Paradis E., Claude J. & Strimmer K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20: 289-290.

The 1000 Genomes Project Consortium (2015). A global reference for human genetic variation. Nature, 526(7571), 68-74.

Wood, S.N. (2011) Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society (B) 73(1):3-36

See Also

LDJump, summary_statistics, vcfR_to_fasta, getPhi, get_smuce, smuceR, rq, gam, vcfR2DNAbin, diseq, genotype, readDNAStringSet

Examples

## Not run:
## These examples require scrm, ms2dna, LDhat, and PhiPack to be installed.
## Replace the placeholder paths with the locations on your system.
scenario <- " -eG 0.0 0 -eG 0.42 -100 -eG 0.5 100 "
simulatedData <- calcRegMod(nsim = 100, pathToScrm = "/path/To/Scrm/",
                            scenario = scenario, pathToMs2dna = "/path/To/Ms2dna/",
                            pathLDhat = "/path/to/LDhat/",
                            pathPhi = "/path/to/Phi/")
## The first element of the returned object is the regression model.
regMod <- simulatedData[[1]]
## fileName is the sequence file to analyse; it is not defined in this example.
result <- LDJump(fileName, alpha = 0.05, segLength = 1000,
                 pathLDhat = "/path/to/LDhat/",
                 pathPhi = "/path/to/Phi/",
                 format = "fasta", regMod = regMod)
## End(Not run)

PhHermann/LDJump documentation built on Nov. 16, 2019, 12:53 p.m.