inbr_gcta: 2011 GCTA inbreeding estimator III

inbr_gcta    R Documentation

Description

This function calculates the biased GCTA inbreeding estimator III described in Yang et al. (2011). According to that paper, these estimates (MOR version) were the basis of the GRM diagonal. However, the GCTA software history shows that this exact estimator was abandoned in version 0.93.0 (8 Jul 2011) in favor of kinship_std() (also MOR version), which remains in use as of this writing (2022).

Usage

inbr_gcta(
  X,
  n = NA,
  mean_of_ratios = FALSE,
  loci_on_cols = FALSE,
  mem_factor = 0.7,
  mem_lim = NA,
  m_chunk_max = 1000
)

Arguments

X

The genotype matrix (BEDMatrix, regular R matrix, or function, same as popkin).

n

The number of individuals. Required if X is a function, ignored otherwise.

mean_of_ratios

The estimator can be computed in two broad forms. If FALSE (default) the ratio-of-means (ROM) version is computed, which behaves more favorably and has a known asymptotic bias. If TRUE, the mean-of-ratios (MOR) version is computed, which is more variable and has an uncharacterized bias, but is most common in the literature.

loci_on_cols

Determines the orientation of the genotype matrix (by default, FALSE, loci are along the rows). If X is a BEDMatrix object, the input value is ignored (set automatically to TRUE internally).

mem_factor

Proportion of available memory to use for loading and processing genotypes. Ignored if mem_lim is not NA.

mem_lim

Memory limit in GB, used to break up genotype data into chunks for very large datasets. Note that memory usage is somewhat underestimated and is not strictly controlled. The default on Linux and Windows is mem_factor times the free system memory; on OSX and other systems it is 1 GB.

m_chunk_max

Sets the maximum number of loci to process at a time. The actual number of loci loaded may be lower if memory is limiting.
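
The X and n arguments allow genotypes to be supplied by a function instead of a matrix. The following is a minimal sketch of such a function, assuming the popkin convention that it is called with a number of loci and returns the next block of loci (as rows), or NULL once no loci are left. The helper make_genotype_reader is hypothetical and not part of popkin or popkinsuppl; check the popkin documentation for the exact interface before relying on it.

```r
# Hypothetical helper (not part of popkin or popkinsuppl).
# Wraps an in-memory loci-by-individuals matrix in a reader function.
# Assumed interface: called with a number of loci, returns the next block
# of rows, or NULL when all loci have been returned.
make_genotype_reader <- function(X_full) {
    i_next <- 1  # index of the first locus not yet returned
    function(m_chunk) {
        # signal the end of the data
        if (i_next > nrow(X_full)) return(NULL)
        i_last <- min(i_next + m_chunk - 1, nrow(X_full))
        block <- X_full[i_next:i_last, , drop = FALSE]
        # remember the position for the next call
        i_next <<- i_last + 1
        block
    }
}

# toy data: 10 loci, 5 individuals
X_full <- matrix(rbinom(50, 2, 0.5), nrow = 10, ncol = 5)
X_fun <- make_genotype_reader(X_full)
```

A function carries no dimensions, which is why n is required in that case; a call would then look like inbr_gcta(X_fun, n = ncol(X_full)).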

Value

A vector of inbreeding estimates, one per individual.
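
To make the difference between the ROM and MOR forms (the mean_of_ratios argument) concrete, here is a plain-R sketch of estimator III as given in Yang et al. (2011), where each locus contributes a numerator x^2 - (1 + 2p) x + 2p^2 and a denominator 2p(1 - p). The function inbr_gcta_sketch is hypothetical and is not the package's implementation: it ignores chunking and memory limits, and its handling of missing genotypes and fixed loci is an assumption.

```r
# Hypothetical illustration (not the popkinsuppl implementation).
# X: loci-by-individuals genotype matrix with values 0, 1, 2 (NA allowed)
inbr_gcta_sketch <- function(X, mean_of_ratios = FALSE) {
    # sample allele frequency per locus, ignoring missing genotypes
    p <- rowMeans(X, na.rm = TRUE) / 2
    # assumption: drop fixed loci, whose denominators 2p(1-p) are zero
    keep <- !is.na(p) & p > 0 & p < 1
    X <- X[keep, , drop = FALSE]
    p <- p[keep]
    # per-locus numerators; p recycles down columns, so it aligns with rows
    num <- X^2 - (1 + 2 * p) * X + 2 * p^2
    # per-locus denominators, blanked wherever the genotype is missing
    den <- matrix(2 * p * (1 - p), nrow = nrow(X), ncol = ncol(X))
    den[is.na(X)] <- NA
    if (mean_of_ratios) {
        # MOR: average the per-locus ratios
        colMeans(num / den, na.rm = TRUE)
    } else {
        # ROM: ratio of summed numerators to summed denominators
        colSums(num, na.rm = TRUE) / colSums(den, na.rm = TRUE)
    }
}
```

The two forms agree when all loci have the same allele frequency, and differ otherwise because MOR gives rare-variant loci (small denominators) much more weight.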

See Also

GCTA 2011 GRM estimator ROM limit kinship_gcta_limit(). The limit of inbr_gcta with mean_of_ratios = FALSE is given by popkin::inbr(kinship_gcta_limit(true_kinship)).

Standard kinship estimator kinship_std() and the limit of the ROM version kinship_std_limit().

GCTA software, including history/update log. https://yanglab.westlake.edu.cn/software/gcta/#Download

Examples

# dimensions of simulated data
n_ind <- 100
m_loci <- 1000
n_data <- n_ind * m_loci

# missingness rate
miss <- 0.1

# simulate ancestral allele frequencies
# uniform (0,1)
# it'll be ok if some of these are zero
p_anc <- runif(m_loci)

# simulate some binomial data
X <- rbinom(n_data, 2, p_anc)

# sprinkle random missingness
# (sample positions in X, not genotype values)
X[ sample(n_data, n_data * miss) ] <- NA

# turn into a matrix
X <- matrix(X, nrow = m_loci, ncol = n_ind)

# estimate inbreeding
# ... ROM version (see Ochoa and Storey (2021)).
inbr_gcta_rom <- inbr_gcta(X)
# ... MOR version (from Yang et al. (2011)).
inbr_gcta_mor <- inbr_gcta(X, mean_of_ratios = TRUE)


OchoaLab/popkinsuppl documentation built on May 17, 2022, 9:50 a.m.