generate_dependence_from_annots: Generate epigenetic marks, SNPs and phenotypes with...

View source: R/generate_annotation_driven_data.R

generate_dependence_from_annots    R Documentation

Generate epigenetic marks, SNPs and phenotypes with epigenome-driven genetic associations

Description

Generate epigenetic marks, SNPs and phenotypes with epigenome-driven genetic associations

Usage

generate_dependence_from_annots(
  n,
  n_loci,
  mean_locus_size,
  p0,
  rho_min_x,
  rho_max_x,
  n_modules,
  mean_module_size,
  rho_min_y,
  rho_max_y,
  r,
  r0,
  prop_act,
  max_tot_pve,
  annots_vs_indep = 1,
  min_dist = 0,
  maf_thres = 0.05,
  max_nb_act_snps_per_locus = 3,
  vec_q = NULL,
  real_snp_mat = NULL,
  real_annot_mat = NULL,
  sd_act_beta = NULL,
  q_pres_annot_loci = NULL,
  bin_annot_freq = 0.05,
  candidate_modules_annots = NULL,
  tpois_lam_act_annots_mm = 1,
  sd_act_prob = 1,
  sd_pat = 1,
  sd_err = 1,
  rbeta_sh1_rr = 1,
  n_cpus = 1,
  maxit = 10000,
  module_specific = FALSE,
  user_seed = NULL,
  return_patterns = FALSE
)

Arguments

n

Number of observations.

n_loci

Number of loci.

mean_locus_size

Mean locus size (drawn from a Poisson distribution).

p0

Minimum number of active SNPs (i.e., associated with at least one phenotype).

rho_min_x

Minimum autocorrelation value for blocks of SNPs in linkage-disequilibrium.

rho_max_x

Maximum autocorrelation value for blocks of SNPs in linkage-disequilibrium.

n_modules

Number of modules of phenotypes.

mean_module_size

Mean module size (drawn from a Poisson distribution).

rho_min_y

Minimum equicorrelation value for the phenotypes in a given module. If NULL, independent phenotypes are simulated.

rho_max_y

Maximum equicorrelation value for the phenotypes in a given module. If NULL, independent phenotypes are simulated.

r

Total number of epigenetic annotations.

r0

Number of epigenetic annotations which trigger genetic associations.

prop_act

Approximate proportion of associated SNP-phenotype pairs.

max_tot_pve

Maximum variance explained by the SNPs for a given phenotype.

annots_vs_indep

Proportion of active SNPs whose effects are triggered by epigenetic marks. Default is 1, for all effects triggered by the epigenome.

min_dist

Minimum distance between each pair of loci (in terms of number of SNPs). Default is 0 for no distance enforced.

maf_thres

Minor allele frequency threshold (applied for both supplied and simulated SNPs). Default is 0.05.

max_nb_act_snps_per_locus

Maximum number of active SNPs per locus. Default is 3.

vec_q

Exact module sizes. Either mean_module_size or vec_q must be NULL. Default is NULL.

real_snp_mat

Matrix of real SNPs supplied by the user. Default is NULL for simulated SNPs under the Hardy-Weinberg assumption.

real_annot_mat

Matrix of real epigenetic annotations supplied by the user. Default is NULL for simulated binary annotations.

sd_act_beta

Standard deviation of the simulated QTL effects. Either sd_act_beta or max_tot_pve must be NULL. Default is NULL.

q_pres_annot_loci

Quantile for selecting annotations which concern most loci (i.e., at least one SNP in each locus). Should be large so enough candidate active SNPs are available when annots_vs_indep is large. Default is NULL.

bin_annot_freq

Minimum frequency of SNPs concerned by a given annotation. Default is 0.05.

candidate_modules_annots

The subset of module ids where all associations are triggered by annotations. The complement are the modules where associations are independent of the annotations. Default is NULL for all modules used as active modules. If n_modules is large, specify a smaller subset of modules, as the mapping may fail otherwise.

tpois_lam_act_annots_mm

Zero-truncated Poisson parameter for drawing the number of active annots per module. Default is 1.

sd_act_prob

Standard deviation for the effects of SNPs and annotations. Default is 1.

sd_pat

Standard deviation for the randomness of the SNP-trait pattern. Default is 1.

sd_err

Response error standard deviation. Default is 1.

rbeta_sh1_rr

Beta distribution shape1 parameter for the proportion of responses associated with an active SNP (in a given module). The shape2 parameter is 1, so the distribution concentrates on large proportions if rbeta_sh1_rr > 1. Default is 1.

n_cpus

Number of CPUs to be used. Default is 1.

maxit

Maximum number of iterations for the repeat loops. Default is 1e4.

module_specific

Boolean specifying whether the epigenome activation is module-specific or not. Default is FALSE.

user_seed

Seed set for reproducibility. Default is NULL, no seed set.

return_patterns

Boolean specifying whether the simulated SNP-phenotype association pattern, regression coefficients and active annotation variables should be returned. Default is FALSE.
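
Several arguments above refer to standard distributions: SNPs simulated under the Hardy-Weinberg assumption and filtered by maf_thres, a zero-truncated Poisson for tpois_lam_act_annots_mm, and a Beta distribution with shape2 equal to 1 for rbeta_sh1_rr. The base-R sketch below illustrates what these terms mean. It is not the package's internal code; the helper rztpois and all variable names are hypothetical and chosen for illustration only.

```r
# Illustrative base-R sketches; NOT the internal code of echoseq.
set.seed(1)

# (1) SNPs under Hardy-Weinberg equilibrium: each genotype counts the minor
#     alleles over two independent draws, i.e. Binomial(2, maf).
n_obs <- 500
maf <- c(0.01, 0.10, 0.30)  # hypothetical minor allele frequencies
snps <- sapply(maf, function(f) rbinom(n_obs, size = 2, prob = f))

# Empirical minor allele frequency and a filter analogous to maf_thres = 0.05.
freq <- colMeans(snps) / 2
emp_maf <- pmin(freq, 1 - freq)
keep <- emp_maf >= 0.05

# (2) Zero-truncated Poisson draws by inverse-CDF sampling: restricting the
#     uniform variate to (P(X = 0), 1) excludes the value 0.
rztpois <- function(m, lambda) {
  u <- runif(m, min = dpois(0, lambda), max = 1)
  qpois(u, lambda)
}
n_act_annots <- rztpois(1000, lambda = 1)

# (3) Beta(shape1, 1): shape1 = 1 is uniform on (0, 1); shape1 > 1 moves
#     the mass toward 1 (the mean is shape1 / (shape1 + 1)).
prop_unif <- rbeta(1000, shape1 = 1, shape2 = 1)
prop_high <- rbeta(1000, shape1 = 5, shape2 = 1)

print(keep)
print(range(n_act_annots))
print(c(mean(prop_unif), mean(prop_high)))
```

Under these assumptions, a larger rbeta_sh1_rr would correspond to active SNPs being associated with a larger share of the responses in a module.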

Value

A list containing the following elements:

snps

Matrix containing the simulated or supplied SNP data.

annots

Matrix containing the simulated or supplied epigenetic annotation data.

phenos

Matrix containing the simulated phenotypic data.

pat

If return_patterns is TRUE, simulated SNP-phenotype association pattern.

beta

If return_patterns is TRUE, simulated SNP-phenotype regression coefficients.

active_annots

If return_patterns is TRUE, active annotation variables.
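
A minimal sketch of how the returned list might be inspected when return_patterns = TRUE. It assumes the echoseq package is installed and that the function is exported from it; the argument values mirror those of the Examples section, and the call is skipped when the package is unavailable. The object name out is hypothetical.

```r
# Sketch only: the call runs when the echoseq package is installed.
if (requireNamespace("echoseq", quietly = TRUE)) {
  out <- echoseq::generate_dependence_from_annots(
    n = 500, n_loci = 20, mean_locus_size = 100, p0 = 10,
    rho_min_x = 0.5, rho_max_x = 0.9,
    n_modules = 5, mean_module_size = 50,
    rho_min_y = 0.5, rho_max_y = 0.9,
    r = 200, r0 = 10,
    prop_act = 0.1, max_tot_pve = 0.5,
    user_seed = 123,
    return_patterns = TRUE
  )

  # Top-level structure of the returned list: element names, types and
  # dimensions, without assuming anything beyond what the call returns.
  str(out, max.level = 1)
}
```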

Examples

user_seed <- 123; set.seed(user_seed)

# Number of samples
#
n <- 500

# Loci
#
n_loci <- 20
mean_locus_size <- 100
p0 <- 10

# Modules of traits
#
n_modules <- 5
mean_module_size <- 50

# Autocorrelation within loci and equicorrelation within trait modules
#
rho_min_x <- rho_min_y <- 0.5
rho_max_x <- rho_max_y <- 0.9

# Annotations
#
r <- 200
r0 <- 10

# Association pattern
#
prop_act <- 0.1
max_tot_pve <- 0.5

list_assoc <- generate_dependence_from_annots(n, n_loci, mean_locus_size, p0,
                                              rho_min_x, rho_max_x,
                                              n_modules, mean_module_size,
                                              rho_min_y, rho_max_y, r, r0,
                                              prop_act, max_tot_pve,
                                              user_seed = user_seed)
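
# The example above sets rho_min_x, rho_max_x (autocorrelation within loci)
# and rho_min_y, rho_max_y (equicorrelation within trait modules). The sketch
# below shows what the two correlation structures look like for a single
# correlation value each. It assumes a first-order autoregressive form for the
# autocorrelation, which may differ from how echoseq builds its SNP blocks;
# all variable names are hypothetical.

```r
# Illustrative only; echoseq may construct these correlations differently.

# Autocorrelation (assumed AR(1) form): correlation decays with the distance
# between SNP positions, rho^|i - j|.
rho_x <- 0.9
p_block <- 5
idx <- seq_len(p_block)
cor_auto <- rho_x^abs(outer(idx, idx, "-"))

# Equicorrelation: every pair of distinct phenotypes in a module shares the
# same correlation rho.
rho_y <- 0.5
q_module <- 4
cor_equi <- matrix(rho_y, nrow = q_module, ncol = q_module)
diag(cor_equi) <- 1

print(round(cor_auto, 3))
print(cor_equi)
```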


hruffieux/echoseq documentation built on Jan. 10, 2024, 10:06 p.m.