kinship_last_gen: Calculate kinship matrix for last generation of a pedigree...
In simfam: Simulate and Model Family Pedigrees with Structured Founders

kinship_last_gen

R Documentation

Calculate kinship matrix for last generation of a pedigree with structured founders

Description

A wrapper around the more general kinship_fam(), specialized to save memory when only the last generation is needed (kinship_fam() returns kinship for the entire pedigree in a single matrix). This function assumes that generations are non-overlapping (as is the case for the output of sim_pedigree()), so the kinship of each generation g can be calculated from generation g-1 data only. As a result, only two consecutive generations need to be in memory at any given time. The partitioning of individuals into generations is given by the ids parameter (which again matches the output of sim_pedigree()).
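The generation-by-generation idea can be sketched in base R as follows. This is a minimal illustration of the standard pedigree kinship recursion, not the package's actual implementation; the helper `kinship_next_gen()` and the toy IDs are hypothetical names introduced only for this sketch.

```r
# Hypothetical helper (NOT part of simfam): kinship of one generation,
# computed from the kinship matrix of the previous generation only.
# `kin_prev` must have row/column names that include every parent ID.
kinship_next_gen <- function(kin_prev, id, pat, mat) {
  # kinship between two children: average over the four parent pairings
  kin <- (
    kin_prev[pat, pat, drop = FALSE] +
    kin_prev[pat, mat, drop = FALSE] +
    kin_prev[mat, pat, drop = FALSE] +
    kin_prev[mat, mat, drop = FALSE]
  ) / 4
  # self-kinship: (1 + kinship between the child's two parents) / 2
  diag(kin) <- (1 + kin_prev[cbind(pat, mat)]) / 2
  dimnames(kin) <- list(id, id)
  kin
}

# Founders: two unrelated, outbred individuals
kin <- diag(2) / 2
dimnames(kin) <- list(c("f", "m"), c("f", "m"))

# Two non-overlapping generations after the founders (toy data)
generations <- list(
  list(id = c("a", "b"), pat = c("f", "f"), mat = c("m", "m")),
  list(id = c("c", "d"), pat = c("a", "a"), mat = c("b", "b"))
)

# Each step overwrites `kin`, so only the previous generation is kept
for (gen in generations) {
  kin <- kinship_next_gen(kin, gen$id, gen$pat, gen$mat)
}
kin
```

In this toy pedigree the last generation consists of full siblings whose parents are themselves full siblings, so `kin["c", "d"]` is 0.375 and each self-kinship is 0.625.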

Usage

kinship_last_gen(kinship, fam, ids, missing_vals = c("", 0))

Arguments

`kinship`	The kinship matrix of the founders. This matrix must have column and row names that identify each founder (matching codes in `fam$id`). Individuals may be in a different order than `fam$id`. Extra individuals that are in `kinship` but absent from `fam$id` are silently ignored. A traditional pedigree calculation would use `kinship = diag(n)/2` (plus appropriate column/row names), where `n` is the number of founders, to model unrelated and outbred founders. However, if `kinship` contains population kinship estimates between founders, the output is also a population kinship matrix (which combines the structural/ancestral and local/pedigree relatedness values into one).
`fam`	The pedigree data.frame, in plink FAM format. Only columns `id`, `pat`, and `mat` are required. `id` must be unique and non-missing. Founders must be present, and their `pat` and `mat` values must be missing (see below). Both parents of every non-founder must be non-missing. Parents must appear earlier than their children in the table.
`ids`	A list containing vectors of IDs for each generation. All these IDs must be present in `fam$id`. If IDs in `fam` and `ids` do not fully agree, the code processes the IDs in the intersection, which is helpful when `fam` is pruned but `ids` is the original (larger) set.
`missing_vals`	The list of ID values treated as missing. `NA` is always treated as missing. By default, the empty string (`""`) and zero (`0`) are also treated as missing (remove values from this list if treating them as missing is a problem).
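As a rough sketch of how these arguments fit together, the following builds a `fam` table that marks missing founder parents with plink-style zeros, a named founder kinship matrix, and the `ids` list. The IDs (`p1`, `c1`, and so on) are made up for illustration, and the final call is guarded so that it only runs if simfam is installed.

```r
# Pedigree table using "0" for the missing parents of founders
fam <- data.frame(
  id  = c("p1", "p2", "c1", "c2"),
  pat = c("0", "0", "p1", "p1"),
  mat = c("0", "0", "p2", "p2"),
  stringsAsFactors = FALSE
)

# Generations: founders first, then their children
ids <- list(c("p1", "p2"), c("c1", "c2"))

# Unrelated, outbred founders: diag(n)/2, named with the founder IDs
n <- length(ids[[1]])
kinship <- diag(n) / 2
dimnames(kinship) <- list(ids[[1]], ids[[1]])

# Basic checks mirroring the documented requirements
stopifnot(
  !anyDuplicated(fam$id),
  all(ids[[1]] %in% rownames(kinship)),
  all(ids[[1]] %in% colnames(kinship))
)

# Only call the package function if simfam is available
if (requireNamespace("simfam", quietly = TRUE)) {
  kinship_children <- simfam::kinship_last_gen(
    kinship, fam, ids, missing_vals = c("", 0)
  )
  print(kinship_children)
}
```

Note that `c("", 0)` is coerced by R to the character vector `c("", "0")`, which is why the `"0"` entries in `pat` and `mat` are expected to count as missing under the default.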

Value

The kinship matrix of the last generation (the intersection of ids[[ length(ids) ]] and fam$id). The columns/rows of this matrix are last-generation individuals in the order that they appear in fam$id.
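The row/column order described above can be previewed with a few lines of base R. This is only an illustration of the documented behavior, not the package's internal code; the ID vectors are toy values, and `"pruned"` stands in for an individual listed in `ids` but no longer present in `fam`.

```r
# Order of individuals as they appear in the fam table
fam_id <- c("father", "mother", "sib", "child")

# Generations; the last one lists an extra ID that is absent from fam
ids <- list(c("father", "mother"), c("child", "sib", "pruned"))

# Last generation as a plain vector (note the double brackets)
last_gen <- ids[[length(ids)]]

# Intersection with fam IDs, kept in fam order
out_order <- fam_id[fam_id %in% last_gen]
out_order
# [1] "sib"   "child"
```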

Examples

# A small pedigree, two parents and two children.
# A minimal fam table with the three required columns.
# Note "mother" and "father" have missing parent IDs, while children do not
library(tibble)
fam <- tibble(
  id = c('father', 'mother', 'child', 'sib'),
  pat = c(NA, NA, 'father', 'father'),
  mat = c(NA, NA, 'mother', 'mother')
)
# need an `ids` list separating the generations
ids <- list( c('father', 'mother'), c('child', 'sib') )

# Kinship of the parents, here two unrelated/outbred individuals:
kinship <- diag(2)/2
# Name the parents with same codes as in `fam`
# (order can be different)
colnames( kinship ) <- c('mother', 'father')
rownames( kinship ) <- c('mother', 'father')
# For a clearer example, make the father slightly inbred
# (a self-kinship value that exceeds 1/2).
# Index by name so the edit does not depend on row/column order:
kinship['father', 'father'] <- 0.6

# calculate the kinship matrix of the children
kinship2 <- kinship_last_gen( kinship, fam, ids )
kinship2
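
The values to expect in this example can be worked out by hand from the standard pedigree kinship recursion. The sketch below assumes that recursion applies here; it has not been checked against actual package output, and the variable names are introduced only for this calculation.

```r
# Founder kinship values used in the example
phi_ff <- 0.6  # father's self-kinship (slightly inbred)
phi_mm <- 0.5  # mother's self-kinship (outbred)
phi_fm <- 0    # father and mother are unrelated

# Kinship between the two children: average over the four parent pairings
sib_kinship <- (phi_ff + 2 * phi_fm + phi_mm) / 4

# Self-kinship of each child: (1 + kinship between its parents) / 2
self_kinship <- (1 + phi_fm) / 2

# Assemble the expected matrix in the order the children appear in `fam`
children <- c('child', 'sib')
expected <- matrix(sib_kinship, 2, 2, dimnames = list(children, children))
diag(expected) <- self_kinship
expected
```

Here `sib_kinship` is 0.275 and `self_kinship` is 0.5. If simfam is installed, `all.equal(kinship2, expected)` is one way to compare the two matrices.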

simfam documentation built on Jan. 10, 2023, 1:06 a.m.