admix_label_cols: Label ancestries based on best match to individual labels


admix_label_cols R Documentation

Label ancestries based on best match to individual labels

Description

Returns one label per ancestry (column) of an admixture matrix, chosen as the best-matching subpopulation label of the individuals (rows). More specifically, each ancestry is assigned the label of the subpopulation in which its admixture proportion, averaged over all individuals of that subpopulation, is highest. If two or more ancestries match the same label, the labels are made unique by appending each column's order of appearance (if the label is "A", the first column that matches it is labeled "A1", the next one "A2", etc.).
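The matching rule described above can be sketched in a few lines of base R. This is only an illustration written from the description, not the package's implementation: the function name label_cols_sketch is hypothetical, ties are assumed to go to the first subpopulation in sorted order, and labels matched by a single ancestry are assumed to stay unchanged.

```r
# Hypothetical helper, NOT part of popkin: illustrates the matching rule only.
label_cols_sketch <- function(Q, labs) {
    stopifnot(is.matrix(Q), nrow(Q) == length(labs))
    # sum admixture proportions within each subpopulation
    # (result is subpopulations x ancestries, rows sorted by label)
    sums <- rowsum(Q, group = labs)
    # divide each row by its subpopulation size to get mean proportions
    counts <- as.vector(table(labs)[rownames(sums)])
    means <- sums / counts
    # for each ancestry (column), pick the subpopulation with the highest mean
    # (assumption: which.max breaks ties by taking the first maximum)
    best <- rownames(means)[unname(apply(means, 2, which.max))]
    # append order of appearance only to labels matched by several ancestries
    dup <- best %in% best[duplicated(best)]
    idx <- ave(seq_along(best), best, FUN = seq_along)
    best[dup] <- paste0(best[dup], idx[dup])
    best
}

# two ancestries match "A" best, one matches "B"
Q_toy <- matrix(
    c(
        0.5, 0.4, 0.1,
        0.1, 0.2, 0.7
    ),
    nrow = 2,
    byrow = TRUE
)
print(label_cols_sketch(Q_toy, c('A', 'B')))
```

In this toy input the first two columns have their highest average in subpopulation "A", so the sketch numbers them in order of appearance, while the third column keeps the plain label "B".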

Usage

admix_label_cols(Q, labs)

Arguments

Q

The admixture proportions matrix.

labs

Subpopulation labels for individuals (rows of Q).

Value

The best label assignments for the ancestries (columns of Q), made unique by appending indexes when several ancestries match the same label.

See Also

admix_order_cols() to automatically order ancestries given ordered individuals.

plot_admix() for plotting admixture matrices.

Examples

# toy admixture matrix with labels for individuals/rows that match well with ancestry/columns
Q <- matrix(
    c(
        0.1, 0.8, 0.1,
        0.1, 0.7, 0.2,
        0.0, 0.4, 0.6,
        0.0, 0.3, 0.7,
        0.9, 0.0, 0.1
    ),
    nrow = 5,
    ncol = 3,
    byrow = TRUE
)
labs <- c('X', 'X', 'Y', 'Y', 'Z')

# to calculate matches and save as column names, do this:
colnames( Q ) <- admix_label_cols( Q, labs )

# expected column names: c('Z', 'X', 'Y')


popkin documentation built on Jan. 7, 2023, 1:26 a.m.