simfam-package: Simulate and Model Family Pedigrees With Structured Founders

simfam-packageR Documentation

Simulate and Model Family Pedigrees With Structured Founders

Description

The simfam package is for constructing large random families—with population studies in mind—with realistic constraints including avoidance of close relative pairings but otherwise biased for closest pairs in a 1-dimensional geography. There is also code to draw genotypes across the pedigree starting from genotypes for the founders. Our model allows for the founders to be related or structured—which arises in practice when there are different ancestries among founders—and given known parameters for these founders, we provide code to calculate the true kinship and expected admixture proportions of all descendants.

Author(s)

Maintainer: Alejandro Ochoa alejandro.ochoa@duke.edu (ORCID)

See Also

Useful links:

Examples

# create some toy data for examples
# data dimensions
n <- 10 # number of individuals per generation
G <- 3 # number of generations
m <- 100 # number of loci
K <- 3 # number of ancestries

# start by simulating pedigree
data <- sim_pedigree( n, G )
# creates FAM table (describes pedigree, most important!)
fam <- data$fam
# lists of IDs split by generation
ids <- data$ids
# and local kinship of last generation
kinship_local_G <- data$kinship_local

# prune pedigree to speed up simulations/etc by removing individuals without
# descendants among set of individuals (here it's last generation)
fam <- prune_fam( fam, ids[[ G ]] )

# to model descendants, first need to setup founders with dummy toy data
# 1) random genotypes
X_1 <- matrix( rbinom( n*m, 2, 0.5 ), m, n )
colnames( X_1 ) <- ids[[ 1 ]] # need IDs to match founders
# 2) kinship of unrelated/outbred individuals
# (but in practice can be any structure!)
kinship_1 <- diag( n ) / 2
colnames( kinship_1 ) <- ids[[ 1 ]] # need IDs to match founders
rownames( kinship_1 ) <- ids[[ 1 ]] # ditto
# 3) construct dummy admixture proportions
admix_proportions_1 <- matrix( runif( n * K ), n, K )
# normalize so rows sum to 1
admix_proportions_1 <- admix_proportions_1 / rowSums( admix_proportions_1 )
rownames( admix_proportions_1 ) <- ids[[ 1 ]] # need IDs to match founders

# now construct/draw/propagate information across the pedigree!

# 1) draw genotypes through pedigree, starting from founders!
X <- geno_fam( X_1, fam )
# version for last generation only, which uses less memory
X_G <- geno_last_gen( X_1, fam, ids )

# 2) calculate kinship through pedigree, starting from kinship of founders!
kinship <- kinship_fam( kinship_1, fam )
# version for last generation only, which uses less memory
kinship_G <- kinship_last_gen( kinship_1, fam, ids )

# 3) calculate expected admixture proportions through pedigree, starting from admixture of founders!
admix_proportions <- admix_fam( admix_proportions_1, fam )
# version for last generation only, which uses less memory
admix_proportions_G <- admix_last_gen( admix_proportions_1, fam, ids )



simfam documentation built on Jan. 10, 2023, 1:06 a.m.