inbr_gcta: 2011 GCTA inbreeding estimator III
In OchoaLab/popkinsuppl: Supplement to popkin package

View source: R/inbr_gcta.R

Description

This function calculates the biased GCTA inbreeding estimator III described in Yang et al. (2011). According to that paper, these estimates (MOR version) formed the diagonal of the GRM. However, the GCTA software history shows that this exact estimator was abandoned in version 0.93.0 (8 Jul 2011) in favor of the estimator implemented in kinship_std() (also MOR version), which remains in use as of this writing (2022).
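
For orientation, the per-locus term of estimator III can be written as below. This is a sketch of the formula as published in Yang et al. (2011), not something stated on this page. The notation is assumed here: $x_{ij}$ is the reference allele count (0, 1, or 2) of individual $j$ at locus $i$, and $p_i$ is the allele frequency at locus $i$.

```latex
\hat{F}^{\mathrm{III}}_{ij}
  = \frac{x_{ij}^2 - (1 + 2 p_i)\, x_{ij} + 2 p_i^2}{2 p_i (1 - p_i)}
```

Per-individual estimates combine these per-locus terms across loci, in either of the two forms selected by `mean_of_ratios`.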

Usage

inbr_gcta(
  X,
  n = NA,
  mean_of_ratios = FALSE,
  loci_on_cols = FALSE,
  mem_factor = 0.7,
  mem_lim = NA,
  m_chunk_max = 1000
)

Arguments

`X`	The genotype matrix (BEDMatrix, regular R matrix, or function, same as `popkin`).
`n`	The number of individuals. Required if `X` is a function, ignored otherwise.
`mean_of_ratios`	The estimator can be computed in two broad forms. If `FALSE` (default) the ratio-of-means (ROM) version is computed, which behaves more favorably and has a known asymptotic bias. If `TRUE`, the mean-of-ratios (MOR) version is computed, which is more variable and has an uncharacterized bias, but is most common in the literature.
`loci_on_cols`	Determines the orientation of the genotype matrix (by default, `FALSE`, loci are along the rows). If `X` is a BEDMatrix object, the input value is ignored (set automatically to `TRUE` internally).
`mem_factor`	Proportion of available memory to use when loading and processing genotypes. Ignored if `mem_lim` is not `NA`.
`mem_lim`	Memory limit in GB, used to break up genotype data into chunks for very large datasets. Note that memory usage is somewhat underestimated and is not strictly controlled. The default on Linux and Windows is `mem_factor` times the free system memory; on other systems (OSX and others) it is 1 GB.
`m_chunk_max`	Sets the maximum number of loci to process at a time. The actual number of loci loaded may be lower if memory is limiting.
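
To make the ROM/MOR distinction in `mean_of_ratios` concrete, here is a minimal base-R sketch. It is not the popkinsuppl implementation: `inbr_gcta_sketch` is a hypothetical helper, the per-locus numerator and denominator follow the formula published in Yang et al. (2011), and the handling of missing genotypes and fixed loci (simply excluded) is an assumption made for illustration. It also ignores chunking and memory limits entirely.

```r
# Hypothetical helper for illustration only; NOT the popkinsuppl implementation.
# Assumes X is a numeric matrix with loci along rows, individuals along columns,
# genotypes coded 0, 1, 2, and NA for missing values.
inbr_gcta_sketch <- function(X, mean_of_ratios = FALSE) {
    # sample allele frequency per locus, ignoring missing genotypes
    p <- rowMeans(X, na.rm = TRUE) / 2
    # per-locus denominator, 2 p (1 - p)
    den <- 2 * p * (1 - p)
    # assumption: drop fixed or fully missing loci (denominator zero or undefined)
    keep <- !is.na(den) & den > 0
    X <- X[keep, , drop = FALSE]
    p <- p[keep]
    den <- den[keep]
    # numerator per locus (rows) and individual (columns);
    # p recycles down each column, so p[i] pairs with row i
    num <- X^2 - (1 + 2 * p) * X + 2 * p^2
    # expand the denominator to a matrix and mask missing genotypes
    den <- matrix(den, nrow = nrow(X), ncol = ncol(X))
    den[is.na(X)] <- NA
    if (mean_of_ratios) {
        # MOR: average the per-locus ratios for each individual
        colMeans(num / den, na.rm = TRUE)
    } else {
        # ROM: ratio of summed numerators to summed denominators
        colSums(num, na.rm = TRUE) / colSums(den, na.rm = TRUE)
    }
}

# toy data: 2 loci (rows) by 3 individuals (columns)
X_toy <- rbind(c(0, 0, 2), c(1, 1, 1))
f_rom <- inbr_gcta_sketch(X_toy)
f_mor <- inbr_gcta_sketch(X_toy, mean_of_ratios = TRUE)
```

On the toy matrix the two forms give different values for the same individuals, because MOR weights every locus equally while ROM effectively weights loci by their denominators.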

Value

Inbreeding estimates, one per individual.

Examples

# dimensions of simulated data
n_ind <- 100
m_loci <- 1000
n_data <- n_ind * m_loci

# missingness rate
miss <- 0.1

# simulate ancestral allele frequencies
# uniform (0,1)
# it'll be ok if some of these are zero
p_anc <- runif(m_loci)

# simulate some binomial data
X <- rbinom(n_data, 2, p_anc)

# sprinkle random missingness
# (sample positions 1..n_data, not the genotype values themselves)
X[ sample(n_data, n_data * miss) ] <- NA

# turn into a matrix
X <- matrix(X, nrow = m_loci, ncol = n_ind)

# estimate inbreeding
# ... ROM version (see Ochoa and Storey (2021)).
inbr_gcta_rom <- inbr_gcta(X)
# ... MOR version (from Yang et al. (2011)).
inbr_gcta_mor <- inbr_gcta(X, mean_of_ratios = TRUE)

OchoaLab/popkinsuppl documentation built on May 17, 2022, 9:50 a.m.