allele_freqs: Compute locus allele frequencies

View source: R/allele_freqs.R

allele_freqs    R Documentation

Compute locus allele frequencies

Description

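On a regular matrix, this is essentially a wrapper for rowMeans() or colMeans() (depending on loci_on_cols), with the mean halved so that genotypes coded as allele counts (0, 1, 2) give frequencies between 0 and 1. On a BEDMatrix object, the locus allele frequencies are computed in chunks of loci, which keeps memory usage low.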
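For a regular matrix, the computation can be approximated in base R. The sketch below is an illustration under stated assumptions, not the package's source: it assumes genotypes are coded as allele counts (0, 1, 2), so the mean count is halved to obtain a frequency, which agrees with the values listed in the Examples section. The helper name freqs_sketch is hypothetical.

```r
# Hypothetical helper (not part of simtrait): allele frequencies from a
# regular genotype matrix coded as 0/1/2 allele counts.
freqs_sketch <- function(X, loci_on_cols = FALSE) {
    # average over individuals, ignoring missing genotypes
    means <- if (loci_on_cols) {
        colMeans(X, na.rm = TRUE)
    } else {
        rowMeans(X, na.rm = TRUE)
    }
    # each individual carries two alleles, so halve the mean count
    means / 2
}

# same toy matrix as in the Examples section
X <- matrix(c(0, 1, 2,
              1, 0, 1,
              1, NA, 2), nrow = 3, byrow = TRUE)
p_rows <- freqs_sketch(X)                       # loci on rows
p_cols <- freqs_sketch(X, loci_on_cols = TRUE)  # loci on columns
```

Because na.rm = TRUE is used, a locus with a missing genotype is averaged over its non-missing individuals only.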

Usage

allele_freqs(
  X,
  loci_on_cols = FALSE,
  fold = FALSE,
  m_chunk_max = 1000,
  subset_ind = NULL
)

Arguments

X

The genotype matrix (regular R matrix or BEDMatrix object). Missing values are ignored in averages.

loci_on_cols

If TRUE, X has loci on columns and individuals on rows; if FALSE (the default), loci are on rows and individuals on columns. If X is a BEDMatrix object, the code assumes loci are on columns and loci_on_cols is ignored.

fold

If TRUE, allele frequencies are converted to minor allele frequencies, so a frequency p above 0.5 is returned as 1 - p. If FALSE (the default), frequencies are returned for the allele counted in X, regardless of whether it is the minor or major allele.

m_chunk_max

BEDMatrix-specific; sets the maximum number of loci to process at a time. If memory usage is excessive, set it to a value lower than the default (this is expected to be necessary only for extremely large numbers of individuals).

subset_ind

Optionally subset individuals by providing their indexes (negative indexes exclude individuals) or a logical vector (in other words, the usual ways to subset matrices). Most useful for BEDMatrix inputs, where subsetting is applied to each chunk so that memory usage stays low.
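How fold, m_chunk_max and subset_ind might interact can be sketched in base R. This is a minimal sketch under assumptions, not the package's implementation: the function name chunked_freqs_sketch is hypothetical, a regular matrix with individuals on rows and loci on columns stands in for a BEDMatrix object, genotypes are assumed to be 0/1/2 allele counts, and locus names are not carried over, for brevity.

```r
# Hypothetical sketch (not simtrait's source): chunked allele frequencies
# for a matrix with individuals on rows and loci on columns.
chunked_freqs_sketch <- function(X, fold = FALSE, m_chunk_max = 1000,
                                 subset_ind = NULL) {
    m_loci <- ncol(X)
    p <- numeric(m_loci)
    i_start <- 1
    while (i_start <= m_loci) {
        # column indexes of the current chunk of loci
        i_end <- min(i_start + m_chunk_max - 1, m_loci)
        idx <- i_start:i_end
        # extract only this chunk; subset individuals (rows) if requested
        X_chunk <- if (is.null(subset_ind)) {
            X[, idx, drop = FALSE]
        } else {
            X[subset_ind, idx, drop = FALSE]
        }
        # halve mean allele counts, ignoring missing genotypes
        p[idx] <- colMeans(X_chunk, na.rm = TRUE) / 2
        i_start <- i_end + 1
    }
    # fold into minor allele frequencies if requested
    if (fold) p <- pmin(p, 1 - p)
    p
}

# toy data: 3 individuals (rows) by 3 loci (columns)
X <- matrix(c(0, 1, 2,
              1, 0, 1,
              1, NA, 2), nrow = 3, byrow = TRUE)
p_all <- chunked_freqs_sketch(X, m_chunk_max = 2)
p_maf <- chunked_freqs_sketch(X, fold = TRUE, m_chunk_max = 2)
p_neg <- chunked_freqs_sketch(X, subset_ind = -1)
p_lgl <- chunked_freqs_sketch(X, subset_ind = c(FALSE, TRUE, TRUE))
```

Only one chunk of at most m_chunk_max loci is held in memory at a time, and because individuals are subset within each chunk, the full subsetted matrix is never materialized. The negative index and the logical vector above select the same two individuals.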

Value

The vector of allele frequencies, one per locus. Names are set to the locus names, if present.
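As an illustration of how names can carry over in the regular-matrix case, base R's rowMeans() keeps row names in its output. The sketch below uses base R only; the locus IDs rs1 and rs2 are hypothetical, and genotypes are assumed to be 0/1/2 allele counts.

```r
# toy genotype matrix with loci on rows; locus IDs are hypothetical
X <- matrix(c(0, 1, 2,
              1, 0, 1), nrow = 2, byrow = TRUE,
            dimnames = list(c("rs1", "rs2"), NULL))
# base R row means keep the row names; halve for 0/1/2 coding
p <- rowMeans(X, na.rm = TRUE) / 2
locus_names <- names(p)
```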

Examples

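library(simtrait)

# Construct toy data: a 3 x 3 genotype matrix of allele counts (0, 1, 2)
# with one missing value
X <- matrix(
    c(0, 1, 2,
      1, 0, 1,
      1, NA, 2),
    nrow = 3,
    byrow = TRUE
)

# loci on rows (default): row means divided by 2, ignoring the NA
allele_freqs(X)
# expected values
c(1/2, 1/3, 3/4)

# same, folded into minor allele frequencies (3/4 becomes 1/4)
allele_freqs(X, fold = TRUE)
# expected values
c(1/2, 1/3, 1/4)

# loci on columns: column means divided by 2, ignoring the NA
allele_freqs(X, loci_on_cols = TRUE)
# expected values
c(1/3, 1/4, 5/6)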


OchoaLab/simtrait documentation built on April 19, 2024, 7:36 p.m.