rbm.GRR: Simulation of genetic data using GRR values

rbm.GRR R Documentation

Simulation of genetic data using GRR values

Description

Generates a simulated bed.matrix with genotypes for cases and controls based on GRR values

Usage

rbm.GRR(genes.maf = Kryukov, size, prev, replicates, 
        GRR.matrix.del, GRR.matrix.pro = NULL, 
        p.causal = 0.5, p.protect = 0, same.variant = FALSE, 
        genetic.model=c("general", "multiplicative", "dominant", "recessive"), 
        select.gene, selected.controls = T, max.maf.causal = 0.01)

Arguments

genes.maf

A data frame containing at least the MAF in the general population (column maf) of each variant and its associated gene (column gene). By default, the file Kryukov is used.

size

A vector containing the size of each group (the first one being the control group)

prev

A vector containing the prevalence of each group of cases

replicates

The number of simulations to perform

GRR.matrix.del

A list containing the GRR matrix associated with the heterozygous genotype compared to the homozygous reference genotype, as if all variants were deleterious. An additional GRR matrix associated with the homozygous alternative genotype is needed if genetic.model="general"

GRR.matrix.pro

The same argument as GRR.matrix.del but for protective variants

p.causal

The proportion of causal variants in cases

p.protect

The proportion of protective variants in cases among causal variants

same.variant

TRUE/FALSE: whether the causal variants are the same in the different groups of cases

genetic.model

The genetic model of the disease

select.gene

Which gene to choose from genes.maf$gene if multiple genes are present. If missing, only the first level is kept.

selected.controls

Whether controls are selected controls (by default) or controls from the general population

max.maf.causal

Only variants with a MAF lower than this threshold can be sampled as causal variants.

Details

The genetic model of the disease needs to be specified in this function.

If genetic.model="general", there is no link between the GRR for the heterozygous genotype and the GRR for the homozygous alternative genotype. Therefore, the user has to give two matrices of GRR, one for the heterozygous genotype, the other for the homozygous alternative genotype.

If genetic.model="multiplicative", we assume that the GRR for the homozygous alternative genotype is the square of the GRR for the heterozygous genotype.

If genetic.model="dominant", we assume that the GRR for the heterozygous genotype and the GRR for the homozygous alternative genotype are equal.

If genetic.model="recessive", we assume that the GRR for the heterozygous genotype is equal to 1: the GRR given is the one associated with the homozygous alternative genotype.
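
The four models above can be summarised in a few lines of base R. This is only an illustrative sketch: grr_by_genotype is a hypothetical helper, not a function of Ravages, and it works on a single GRR value rather than on the GRR matrices that rbm.GRR expects.

```r
# Hypothetical helper (NOT part of Ravages): returns the GRR of the
# heterozygous and homozygous alternative genotypes, relative to the
# homozygous reference genotype, under each genetic model.
grr_by_genotype <- function(grr,
                            genetic.model = c("general", "multiplicative",
                                              "dominant", "recessive"),
                            grr.hom = NULL) {
  genetic.model <- match.arg(genetic.model)
  switch(genetic.model,
    general = {
      # No link between the two GRRs: both must be supplied
      if (is.null(grr.hom)) stop("'general' needs a separate homozygous GRR")
      c(het = grr, hom = grr.hom)
    },
    multiplicative = c(het = grr, hom = grr^2),  # homozygous GRR is the square
    dominant       = c(het = grr, hom = grr),    # both GRRs are equal
    recessive      = c(het = 1,   hom = grr)     # heterozygous GRR is 1
  )
}

grr_by_genotype(2, "multiplicative")
grr_by_genotype(2, "recessive")
grr_by_genotype(2, "general", grr.hom = 5)
```

Only the "general" model needs a second value, which mirrors the requirement of a second GRR matrix in GRR.matrix.del.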

GRR.matrix.del contains GRR values as if all variants are deleterious. These values will be used only for the proportion p.causal of variants that will be sampled as causal.

If selected.controls = TRUE, genotypic frequencies in the control group are computed from genotypic frequencies in the case groups and the prevalence of the disease. If FALSE, genotypic frequencies in the control group are computed from allelic frequencies under Hardy-Weinberg equilibrium.
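
To make the two kinds of controls concrete, here is a minimal sketch for a single variant. It is one plausible reading of the description above, not necessarily the exact computation done by the package. The helper genotype_freqs is hypothetical, and the sketch assumes that the general population is in Hardy-Weinberg equilibrium, that case frequencies are proportional to GRR times the population frequency, and that selected controls are all individuals who belong to no case group.

```r
# Hypothetical helper (NOT part of Ravages): genotype frequencies for one
# variant, in each case group and in the control group.
# maf     : minor allele frequency in the general population
# grr.het : heterozygous GRR, one value per case group
# grr.hom : homozygous alternative GRR, one value per case group
# prev    : prevalence of each case group
genotype_freqs <- function(maf, grr.het, grr.hom, prev,
                           selected.controls = TRUE) {
  # General population under Hardy-Weinberg equilibrium
  pop <- c(hom.ref = (1 - maf)^2, het = 2 * maf * (1 - maf), hom.alt = maf^2)

  # Cases: P(genotype | case) is proportional to GRR * P(genotype);
  # one row per case group
  cases <- sweep(cbind(1, grr.het, grr.hom), 2, pop, `*`)
  cases <- cases / rowSums(cases)
  colnames(cases) <- names(pop)

  controls <- if (selected.controls) {
    # Remove the share of each genotype carried by cases, then renormalise
    (pop - colSums(prev * cases)) / (1 - sum(prev))
  } else {
    pop  # controls drawn from the general population
  }
  list(cases = cases, controls = controls)
}

f <- genotype_freqs(maf = 0.01, grr.het = c(2, 4), grr.hom = c(4, 16),
                    prev = c(0.001, 0.001))
f$cases
f$controls
```

With a low prevalence, as here, selected controls are very close to the general population; the difference grows as the prevalence increases.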

The files Kryukov or GnomADgenes available with the package Ravages can be used as the argument genes.maf.

If GRR.matrix.del (or GRR.matrix.pro) has been generated using the function GRR.matrix, the arguments genes.maf and select.gene should have the same value as in GRR.matrix.

Only non-monomorphic variants are kept for the simulations.

Causal variants that have been sampled in each group of individuals are indicated in x@ped$Causal.

Value

A bed.matrix with as many columns (variants) as replicates*number of variants. The field x@snps$genomic.region contains the replicate number and the field x@ped$pheno contains the group of each individual, "0" being the control group.

See Also

GRR.matrix, Kryukov, GnomADgenes, rbm.GRR.power

Examples

#GRR values calculated with the SKAT formula
GRR.del <- GRR.matrix(GRR = "SKAT", genes.maf = Kryukov, 
                      n.case.groups = 2, select.gene = "R1",
                      GRR.multiplicative.factor=2)
                              
#Simulation of one group of 1,000 controls and two groups of 500 cases, 
#each one with a prevalence of 0.001
#with 50% of causal variants, 5 genomic regions are simulated.
x <- rbm.GRR(genes.maf = Kryukov, size = c(1000, 500, 500), 
             prev = c(0.001, 0.001), GRR.matrix.del = GRR.del, 
             p.causal = 0.5, p.protect = 0, select.gene="R1",
             same.variant = FALSE, 
             genetic.model = "multiplicative", replicates = 5)

Ravages documentation built on April 1, 2023, 12:08 a.m.