family_sim_compare: Compare simulated and observed inferred genetic relationships
In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses

family_sim_compare

R Documentation

Compare simulated and observed inferred genetic relationships

Description

This function can help compare observed estimates of relatedness to expected values of relatedness for different familial relationships given a set of loci and population alelle frequencies. It returns a data table combining the simulated and observed estiamtes and a plot of density curves overlaying the distribution of observed relatedness estimates on top of simulated values.

Usage

family_sim_compare(
  simGRM,
  obsGRM,
  plotColous = NULL,
  look = "ggplot",
  legendPos = "right",
  curveAlpha = 0.7,
  curveFill = NULL,
  curveOutline = NULL,
  facetFamily = FALSE
)

Arguments

`simGRM`	Matrix: The simulated GRM using the individuals generated from `family_sim_genos`. See Details.
`obsGRM`	Matrix: The observed GRM using the observed individuals. See Details.
`look`	Character: The look of the plot. Default = `'ggplot'`, the typical gray background with gridlines produced by `ggplot2`. Alternatively, when set to `'classic'`, produces a base R style plot.
`legendPos`	Character: Where should the legend be positioned? Default is `'top'`, but could also be one of, `'right'`, `'bottom'`, `'left'`, or `'none'`.
`curveAlpha`	Numeric: A value between 0 and 1 to set the transparency of the density curves. Default = 0.7.
`curveFill`	Character: A vector of colours to fill density curves, but is an optional argument. Default = `NULL`. If specified, must be a length of 5, with colours corresponding to the levels 'Unrelated', 'Cousins', 'Half-siblings', 'Siblings', and 'Observed', in that order.
`curveOutline`	Character: A vector of colours to for density curve outlines, but is an optional argument. Default = `NULL`. If specified, must be a length of 5, with colours corresponding to the levels 'Unrelated', 'Cousins', 'Half-siblings', 'Siblings', and 'Observed', in that order.
`facetFamily`	Logical: Should the plot be faceted by family relationship? Default is FALSE.
`numSims`	Integer: The number of simulated individuals for each family relationship. Default is 100.

Details

The function takes the output of family_sim_genos and additional calculations of the genetic relationiship matrix (GRM) that are performed by the user on the simulated and observed individuals.

The GRMs for arguments simGRM and obsGRM need to be created by the user with whatever program they want to use to calculate pairwise relatedness among individuals. The same function call should be used ob both datasets. The GRM should be a square matrix with the relatedness of individuals to themselves on the diagonal, and their relatedness to other individuals on the off-diagonal.

Value

Returns two objects. The first is data.table with the following columns:

$SIM, the simulation number for pairs of simulated individuals, or 'NA' for the pairs of observed individuals.
$SAMPLE1, the sample ID for the first individual.
$SAMPLE2, the sample ID for the second individual.
$FAMILY, the familial relationship for simulated individuals.
$RELATE, the estimated relatedness.

The second is a ggplot object which plots density curves for the estimated relatedness values calculated for simulated pairs of unrelated individuals, cousins, half-siblings, and siblings, with the observed relatedness values from the user's dataset overlayed. Dashed lines are used to demarcate the theoretical expected relatedness values for unrelated individuals (0), cousins (0.125), half-siblings (0.25), and siblings (0.5).

Examples

library(genomalicious)
data(data_Genos)

# Subset Pop1 genotypes
genosPop1 <- data_Genos[POP=='Pop1', c('SAMPLE', 'LOCUS', 'GT')]

# Get the allele frequencies for Pop1
freqsPop1 <- genosPop1[, .(FREQ=sum(GT)/(length(GT)*2)), by=LOCUS]

# Simulate 100 families
simFamily <- family_sim_genos(
   freqData=freqsPop1,
   locusCol='LOCUS',
   freqCol='FREQ',
   numSims=100
)

### THE OBSERVED GENOMIC RELATIONSHIPS MATRIX
library(sommer)

# Note, for sommer, we have to adjust genotyeps to range from -1 to 1.

# A genotype matrix for the focal pairs
obsGenosMat <- genosPop1 %>%
  copy %>%
  .[, GT:=GT-1] %>%
  DT2Mat_genos()

# Calculate the GRM
obsGRM <- sommer::A.mat(obsGenosMat)

### THE SIMULATED GENOMIC RELATIONSHIPS MATRIX
# Convert simulated families into a genotype matrix
simGenosMat <- simFamily$focal.pairs %>%
  copy %>%
  .[, GT:=GT-1] %>%
  DT2Mat_genos()

# Calculate the GRM
simGRM <- sommer::A.mat(simGenosMat)

### COMPARE THE OBSERVED AND SIMULATED
relComp <- family_sim_compare(
   simGRM=simGRM,
   obsGRM=obsGRM,
   look='classic'
)

# The data
relComp$data

# Simulated dataset
relComp$data[!is.na(SIM)]

# The observed dataset
relComp$data[is.na(SIM)]

# Plot of relatedness values. Dashed lines denote relatedness
# values of 0, 0.0625, 0.125, 0.25, and 0.5, which are the theoretical
# expectations for unrelated individuals, half-cousins, cousins, half-siblings,
# and siblings, respectively.
# You will note a large variance are the expected values, which
# is not surprising for this very small SNP dataset (200 loci).
relComp$plot

# Take a look at the "known" relationships in the observed dataset using the
# offspring we created.
# Note, whilst values are centered on their theoretical expectations,
# you will probably find the "observed" relatedness might be lower or even higher.
relComp$data[FAMILY=='Half-siblings']$RELATE %>% summary()
relComp$data[FAMILY=='Siblings']$RELATE %>% summary()
relComp$data[FAMILY=='Parent-offspring']$RELATE %>% summary()

j-a-thia/genomalicious documentation built on April 13, 2025, 9:41 a.m.

j-a-thia/genomalicious index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

j-a-thia/genomalicious
A smorgasbord of R functions for population genomic analyses

family_sim_compare: Compare simulated and observed inferred genetic relationships
In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses

Compare simulated and observed inferred genetic relationships

Description

Usage

Arguments

Details

Value

Examples

Related to family_sim_compare in j-a-thia/genomalicious...

R Package Documentation

Browse R Packages

We want your feedback!

j-a-thia/genomalicious A smorgasbord of R functions for population genomic analyses

family_sim_compare: Compare simulated and observed inferred genetic relationships In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses

Compare simulated and observed inferred genetic relationships

Description

Usage

Arguments

Details

Value

Examples

Related to family_sim_compare in j-a-thia/genomalicious...

R Package Documentation

Browse R Packages

We want your feedback!

j-a-thia/genomalicious
A smorgasbord of R functions for population genomic analyses

family_sim_compare: Compare simulated and observed inferred genetic relationships
In j-a-thia/genomalicious: A smorgasbord of R functions for population genomic analyses