mc_sim_pheno | R Documentation |
Simulate phenotypes with featured and residual causal variants for multiple crosses. It is possible to add a dilution, with noncausal SNP in the genomic feature.
mc_sim_pheno(
X,
map,
n_QTL = 100,
QTL_distribution = "random",
min_distance = NULL,
prop_f = 0,
Sp = 100,
mu = 50,
h2 = 0.5,
h2f = 0,
scale_mk = FALSE,
theta = 0.95,
kj = "decrease_random",
k_val = NULL,
corr_f = 0.5,
dilution_percent = 0,
type_effect = "equal",
s_effect = 0.9
)
X |
|
map |
genetic marker map: marker id, chromosome, genetic position, physical position. |
n_QTL |
number of QTLs (causal variants) to be divided into featured and residual variants. Default = 100 |
QTL_distribution |
Fixed to "random" (for the moment) |
min_distance |
Minimum distance in kb between two selected QTLs. Default = NULL. Not used now |
prop_f |
Proportion of featured causal variants. Default = 0.5 |
Sp |
Phenotypic variance. Default = 100 |
mu |
Grand mean. Default = 50 |
h2 |
heritability. Default = 0.5 |
h2f |
Proportion of genetic variance due to the featured (annotated) markers. Default = 0.5 |
scale_mk |
|
theta |
Parameter values of the marker distribution function
( |
kj |
Character string specifying the distribution of proportion by the causal variant in the featured and residual class. One of "constant", "decrease_along_genome", "decrease_random", "user_specified". Default = "decrease_random" |
k_val |
|
corr_f |
|
dilution_percent |
|
type_effect |
Character string specifying le type d'effet des marqueurs selon les croisements. "equal" pour que tous les croisements aient le même effet. "series" pour que les effets des croisements suivent une série allélique avec le paramètre s_effect. (s, s^2, s^3...). Default = "equal". |
s_effet |
|
The phenotype are simulated following this sequence:
1. Underlying model: P_{i} = \mu + G_i + \epsilon_{i}
(unreplicated genotypes) with variance structure
\sigma_{p}^2 = \sigma_{g}^2 + \sigma_{\epsilon}^2
with \sigma_{g}^2 = \sigma_{g_f}^2 + \sigma_{g_r}^2
\sigma_{g_f}^2
: genetic variance due to functional SNP
\sigma_{g_r}^2
: residual genetic variance
The latter are linked through the following relationship h_f^2 = \frac{\sigma_{g_f}^2}{\sigma_{g}^2} = \frac{\sigma_{g_f}^2}{\sigma_{g_f}^2 + \sigma_{g_r}^2}
2. Fix \sigma_{p}^2 = 100
and \mu = 50
3. Get the genetic variance given assumed variance \sigma_{g}^2 = h^2*\sigma_{p}^2
4. Get the genetic variance under functional annotations \sigma_{g_f}^2 = h_f^2*\sigma_{g}^2
5. Get the residual genetic variance \sigma_{g_r}^2 = \sigma_{g}^2 - \sigma_{g_f}^2
6. Select n_{SNP}
that are spread into the f
and the r
class to form X_{f}
and X_{r}
given the percentage of markers associated to each category.
############ MODIFIER CA
7. Calculate for each selected marker the minor allele frequency p_j
8. Determine the marker effect distribution using the following function
f(\theta, j) = \theta^{j} \quad j=1, ..., n_{SNP}
9. Determine the proportion of genetic variance accounted for associated to each marker. This is done for each class of effect separately. The values of \theta^{j}
are randomly sampled from the general distribution to respect the general distribution of the marker effects.
k_j(f) = \frac{\theta^{j}}{\sum_{m=1}^{n_{SNP(f)}}{\theta^{m}}}
10. For each selected SNP, determine the marker effect to meet the determined \sigma_{g_f}^2
and \sigma_{g_r}^2
using the formula proposed by Mollandin et al. (2022)
\beta_{j(f)} = \pm (\frac{1}{2}) \sqrt{\frac{k_{j(f)}*\sigma_{g_f}^2}{2pq}}
11. Form the simulated phenotypes
\tilde{y} = \mu + X_{s(f)}'\beta_{j(f)} + X_{s(r)}'\beta_{j(r)} + \epsilon_{i}
where X_{s(f)}
(X_{s(r)}
) is the centered and scaled marker matrix containing the f
(r
) markers.
and \epsilon_i \sim N(0, \sigma_{\epsilon}^2)
where \sigma_{\epsilon}^2 = (1-h^2) * \sigma_{p}^2
##############
12. Annotation dilution
n_{QTL.polluted} = n_{QTL} * \frac{dilution.percent}{100}
Dilution percent : if there is 100 causal markers to be divided into featured and residual variants, 10 to 10 markers non causal added to the causal featured ones.
list containing the following object
d_y: data.frame with: simulated phenotype (y_sim), featured causal variants contribution (y_f), residual causal variants contribution (y_r), error contribution (e_i)
mk_sel_f: selected featured causal variants
mk_sel_r: selected residual causal variants
X_f: marker matrix of the featured causal variants (if dilution_percent != 0, marker matrix with also noncausal variants featured selected)
X_r: marker matrix of the residual causal variants
Bf: Additive effect of the featured causal variants
Br: Additive effect of the residual causal variants
# Generate genotype data
data("geno_par")
rownames(geno_par) <- paste0('P', 1:nrow(geno_par))
data("EUNAM_map")
rownames(map) <- map[, 1]
# single cross
cross_scheme <- cross_scheme_NAM(np = 5)
geno_par <- geno_par[1:5, ]
geno <- sim_mpp_cross(geno_par = geno_par, map = map,
cross_scheme = cross_scheme,
n_ind_cr = rep(100, nrow(cross_scheme)))
# type effect : equal
y_sim <- mc_sim_pheno(X = geno$geno_num_IBD, map = map,
type_effect = "equal")
var(y_sim$d_y$y_r)
# test simulation consistency
res <- matrix(NA, nrow = 100, ncol = 4)
for(i in 1:100){
d <- mc_sim_pheno(X = geno$geno_num_IBD, map = map,
type_effect = "equal")
res[i, ] <- diag(var(d$d_y))
}
colnames(res) <- c("Sp", "Sgf", "Sg", "Se")
res <- res[, -2]
res <- data.frame(res)
boxplot(res)
abline(h = 100, col = "green")
abline(h = 50, col = "blue")
# type effect : series
y_sim <- mc_sim_pheno(X = geno$geno_num_IBD, map = map,
type_effect = "series", s_effect = 0.9)
var(y_sim$d_y$y_r)
# test simulation consistency
res <- matrix(NA, nrow = 100, ncol = 4)
for(i in 1:100){
d <- mc_sim_pheno(X = geno$geno_num_IBD, map = map,
type_effect = "series", s_effect = 0.9)
res[i, ] <- diag(var(d$d_y))
}
colnames(res) <- c("Sp", "Sgf", "Sg", "Se")
res <- res[, -2]
res <- data.frame(res)
boxplot(res)
abline(h = 100, col = "green")
abline(h = 50, col = "blue")
# with 10% of dilution (not yes) :
# y_sim_dil <- mc_sim_pheno(X = geno$geno_num_IBD, map = map,
dilution_percent = 10)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.