family_sim_qtl: Simulate quantitative trait from simulated families
In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses

family_sim_qtl

R Documentation

Simulate quantitative trait from simulated families

Description

This function is designed to pair with the family_sim_genos function to simulate a quantitative trait from a set of simulated families. However, it could also be used to simulate a quantitative trait from observed data. The function is currently only designed to simulate a single trait controlled by n loci with phenotypic variance, Vp, the sum of the additive, Va, and environmental, Ve, variances.

Usage

family_sim_qtl(
  famGenos,
  numLoci = NULL,
  qtlNames = NULL,
  additiveVar,
  environVar,
  effectSizes = 1
)

Arguments

`famGenos`	Data.table: Long-format, must contain the columns `$SAMPLE` of sample IDs, `$LOCUS` of locus IDs, and `$GT` of genotypes.
`numLoci`	Integer: The number of loci contributing to the trait. These loci are drawn from random in the dataset, `famGenos`. Therefore, it is essential the `numLoci` is <= the total unique loci in `famGenos`. Alternate parameterisation is to specify the names of the loci to use with `qtlNames`.
`qtlNames`	Character: A character vector of loci contributing to the trait. These loci must be present in the dataset, `famGenos`. Alternate parameterisation is to specify the number of loci to use with `numLoci`.
`additiveVar`	Numeric: The additive genetic variance.
`environVar`	Numeric: The environmental variance.
`effectSizes`	Numeric: A vector of effect sizes (positive values only) to assign to AA homozygous genotype. Default = 1, that is, all QTLs will have the same effect size.

Details

Note, the current implementation of this function generates a specific additive + environmental variance for a trait, given a set of allele frequencies and the number/identify of loci of interest. The phenotypes and genotypic values simulated are only relevant to the input sample only.

A total of n == numLoci loci from famGenos are drawn at random to underpin the quantitative trait. It is assumed that all loci contribute equally to the trait, i.e., Va/n. The simulation starts by first fitting genotype AA with a genetic value of +1, Aa with a value of 0, and aa with a value of -1. Because the additive genetic variance is also dependent on alelle frequencies:

Va = 2pq[a + d(q - p)]^2

The genetic values are then modified relative to the allele frequencies to ensure the total additivie genetic variance in the sample sum to the specified Va. The phenotype per individual is the sum of their genotypic values plus randomly drawn environmental deviation (rnorm(.., mean=0, sd=sqrt(Ve))).

When assigning variable effect sizes to argument effectSizes, the user needs to generate a vector of values. For example, using the rgamma or the rexp functions. Note, these represent the homozygous AA genotype effect sizes, so the aa genotype effect sizes will be the negative of these values.

Value

A list with the indexes $trait and $loci is returned, containing information on the traits and the underpinning loci.

The index $trait contains a data.table with the following columns:

$SAMPLE, the sample IDs.
$G, the additive genetic values, with variance Va.
$E, the environmental values, with variance Ve.
$P, the phenoypic values, G + E.

The index $loci contains a data.table with the following columns:

$LOCUS, the locus IDs.
$FREQ, the frequency of the focal allele.
$A.VAL, the additive genetic value for this locus.

References

Falconer and MacKay (1996) Introduction to Quantitative Genetics, 3rd Ed., chapter 8, page 129.

Examples

library(genomalicious)
data(data_Genos)

# Subset Pop1 genotypes
genosPop1 <- data_Genos[POP=='Pop1', c('SAMPLE', 'LOCUS', 'GT')]

# Get the allele frequencies for Pop1
freqsPop1 <- genosPop1[, .(FREQ=sum(GT)/(length(GT)*2)), by=LOCUS]

# Simulate 100 familial relationships of each class
simFamily <- family_sim_genos(
   freqData=freqsPop1,
   locusCol='LOCUS',
   freqCol='FREQ',
   numSims=100,
   returnParents=TRUE,
   returnPedigree=TRUE
)

# Take a look at the focal pairs
simFamily$focal.pairs

# Take a look at the parentals
simFamily$parents

# Take a look at the pedigree
simFamily$pedigree

### THE OBSERVED GENOMIC RELATIONSHIPS MATRIX
library(sommer)

# Note, for sommer, we have to adjust genotyeps to range from -1 to 1.

# A genotype matrix for the focal pairs
obsGenosMat <- genosPop1 %>%
  copy %>%
  .[, GT:=GT-1] %>%
  DT2Mat_genos()

# Calculate the GRM
obsGRM <- sommer::A.mat(obsGenosMat)

### THE SIMULATED GENOMIC RELATIONSHIPS MATRIX
# Convert simulated families into a genotype matrix
simGenosMat <- simFamily$focal.pairs %>%
  copy %>%
  .[, GT:=GT-1] %>%
  DT2Mat_genos()

# Calculate the GRM
simGRM <- sommer::A.mat(simGenosMat)

### COMPARE THE OBSERVED AND SIMULATED
relComp <- family_sim_compare(
   simGRM=simGRM,
   obsGRM=obsGRM,
   look='classic'
)

# The data
relComp$data

# Simulated dataset
relComp$data[!is.na(SIM)]

# The observed dataset
relComp$data[is.na(SIM)]

# Plot of relatedness values. Dashed lines denote relatedness
# values of 0, 0.0625, 0.125, 0.25, and 0.5, which are the theoretical
# expectations for unrelated individuals, half-cousins, cousins, half-siblings,
# and siblings/parent-offspring, respectively.
# You will note a large variance in the expected values, which
# is not surprising for this very small SNP dataset (200 loci).
relComp$plot

### SIMULATE A QUANTITATIVE TRAIT

# Combine the focal pairs and parentals, and simualte a trait controlled by
# 100 loci with Va = 1, and Ve = 1, with equal effect sizes.
simQTL.equal <- family_sim_qtl(
  famGenos=rbind(simFamily$focal.pairs, simFamily$parents),
  numLoci=100, additiveVar=1, environVar=1, effectSizes=1
  )

# The trait values for equal effect sizes.
simQTL.equal$trait

# The locus values to equal effect sizes.
simQTL.equal$loci

# This time, let's use vairable effect sizes that follow a gamma
# distribution of effects.
eff_sizes <- rgamma(5000, shape=0.5, scale=0.2)

# Fit with the same loci as used above.
simQTL.variable <- family_sim_qtl(
  famGenos=rbind(simFamily$focal.pairs, simFamily$parents),
  qtlNames=simQTL.equal$loci$LOCUS,
  additiveVar=1, environVar=1, effectSizes=eff_sizes
  )

# The trait values
simQTL.variable$trait

# The locus values
simQTL.variable$loci

# Compare the values
plot(simQTL.equal$trait$P, simQTL.variable$trait$P)

# But they both have the desired variances
var(simQTL.equal$trait$G)
var(simQTL.variable$trait$G)
var(simQTL.equal$trait$E)
var(simQTL.variable$trait$E)

j-a-thia/genomalicious documentation built on April 13, 2025, 9:41 a.m.

j-a-thia/genomalicious index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

j-a-thia/genomalicious
A smorgasbord of R functions for population genomic analyses

family_sim_qtl: Simulate quantitative trait from simulated families
In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses

Simulate quantitative trait from simulated families

Description

Usage

Arguments

Details

Value

References

Examples

Related to family_sim_qtl in j-a-thia/genomalicious...

R Package Documentation

Browse R Packages

We want your feedback!

j-a-thia/genomalicious A smorgasbord of R functions for population genomic analyses

family_sim_qtl: Simulate quantitative trait from simulated families In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses

Simulate quantitative trait from simulated families

Description

Usage

Arguments

Details

Value

References

Examples

Related to family_sim_qtl in j-a-thia/genomalicious...

R Package Documentation

Browse R Packages

We want your feedback!

j-a-thia/genomalicious
A smorgasbord of R functions for population genomic analyses

family_sim_qtl: Simulate quantitative trait from simulated families
In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses