recomb_fam: Draw recombination breaks for autosomes from a pedigree

View source: R/recomb_fam.R

recomb_famR Documentation

Draw recombination breaks for autosomes from a pedigree

Description

Create random recombination breaks for all autosomes of all individuals in the provided pedigree FAM table. Recombination lengths follow an exponential distribution with mean of 100 centiMorgans (cM). The output specifies identical-by-descent (IBD) blocks as ranges per chromosome (per individual) and the founder chromosome they arose from (are IBD with). All calculations are in terms of genetic distance (not base pairs), and no genotypes are constructed/drawn in this step.

Usage

recomb_fam(founders, fam, missing_vals = c("", 0))

Arguments

founders

The named list of founders with their chromosomes. For unstructured founders, initialize with recomb_init_founders(). Each element of this list is a diploid individual, which is a list with two haploid individuals named pat and mat, each of which is a list of chromosomes (always identified by number, but may also be named arbitrarily), each of which is a data.frame/tibble with implicit ranges (posg is end coordinates in cM; start is the end of the previous block, zero for the first block) and ancestors anc as strings. For true founders each chromosome may be trivial (each chromosome is a single block with ID equal to itself but distinguishing maternal from paternal copy), but input itself can be recombined (for iterating). This list must have names that identify each founder (matching codes in fam$id). Individuals may be in a different order than fam$id. Extra individuals in founders but absent in fam$id will be silently ignored.

fam

The pedigree data.frame, in plink FAM format. Only columns id, pat, and mat are required. id must be unique and non-missing. Founders must be present, and their pat and mat values must be missing (see below). Non-founders must have both their parents be non-missing. Parents must appear earlier than their children in the table.

missing_vals

The list of ID values treated as missing. NA is always treated as missing. By default, the empty string (”) and zero (0) are also treated as missing (remove values from here if this is a problem).

Value

The list of individuals with recombined chromosomes of the entire fam table, in the same format as founders above. The names of this list correspond to fam$id in that order.

See Also

recomb_init_founders() to initialize founders for this function.

Plink FAM format reference: https://www.cog-genomics.org/plink/1.9/formats#fam

Examples

# The smallest pedigree, two parents and a child.
# A minimal fam table with the three required columns.
# Note "mother" and "father" have missing parent IDs, while "child" does not
library(tibble)
fam <- tibble(
  id = c('father', 'mother', 'child'),
  pat = c(NA, NA, 'father'),
  mat = c(NA, NA, 'mother')
)

# initialize parents with this other function
# Name the parents with same codes as in `fam`
# (order can be different)
ids <- c('mother', 'father')
# simulate three chromosomes with these lengths in cM
lengs <- c(50, 100, 150)
founders <- recomb_init_founders( ids, lengs )

# draw recombination breaks for the whole fam table now:
inds <- recomb_fam( founders, fam )

# This is a length-3 list with names matching fam$id.
# The parent data equals the input (reordered),
# but now there's data to the child too
inds


simfam documentation built on Jan. 10, 2023, 1:06 a.m.