View source: R/rescale_popkin.R

rescale_popkin | R Documentation |

If you already have a population kinship matrix and want the kinship matrix for a subset of the individuals, you can either re-estimate it from the genotypes of that subset (slow) or take the fast route: first subset the kinship matrix to the individuals of interest, then use this function to rescale the subset so that its minimum kinship is zero. This rescaling is required when the subset has a more recent Most Recent Common Ancestor (MRCA) population than the original dataset (for example, if the original data had individuals from across the world but the subset contains individuals from a single continent only).
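As a rough illustration of the subsetting step, the base R sketch below builds a toy kinship matrix (the values are invented for illustration and were not estimated from data) and keeps two individuals from the same group. The names `keep` and `kinship_sub` are hypothetical. The point is only that the minimum of the subset is no longer zero, which is the situation this function is meant to correct.

```r
# Toy 4-by-4 kinship matrix; values are invented for illustration only.
# Individuals 1-2 and 3-4 form two groups with zero kinship between groups.
kinship <- matrix(
  c(0.60, 0.20, 0.00, 0.00,
    0.20, 0.60, 0.00, 0.00,
    0.00, 0.00, 0.55, 0.10,
    0.00, 0.00, 0.10, 0.55),
  nrow = 4, byrow = TRUE
)

# Keep only the individuals of interest: subset rows AND columns together
keep <- c(3, 4)
kinship_sub <- kinship[keep, keep]

min(kinship)      # minimum over the full matrix
min(kinship_sub)  # minimum over the subset, which is now above zero
```

With the package loaded, `kinship_sub` would then be passed to `rescale_popkin()`, ideally together with the matching subset of subpopulation labels.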

rescale_popkin(kinship, subpops = NULL, min_kinship = NA)

`kinship` |
An `n`-by-`n` kinship matrix. |

`subpops` |
The length-`n` vector of subpopulation assignments for each individual. |

`min_kinship` |
A scalar kinship value to define the new zero kinship. |

This function rescales the input `kinship` matrix so that the value `min_kinship` in the original kinship matrix becomes zero, using the formula `kinship_rescaled = ( kinship - min_kinship ) / ( 1 - min_kinship )`.
This is equivalent to changing the ancestral population of the data.
If subpopulation labels `subpops` are provided (recommended), they are used to estimate `min_kinship` using the function `popkin_A_min_subpops()`, which is the recommended way to set the MRCA population correctly.
If both `subpops` and `min_kinship` are provided, only `min_kinship` is used.
If both `subpops` and `min_kinship` are omitted, the function sets `min_kinship = min( kinship )`.
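The formula above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `rescale_sketch` is a hypothetical helper, the toy matrix values are invented, and only the case where `min_kinship` defaults to `min( kinship )` or is given explicitly is covered (the `subpops` path through `popkin_A_min_subpops()` is not reproduced).

```r
# Hypothetical helper mirroring the documented formula; NOT part of popkin.
# When min_kinship is omitted it defaults to the smallest value in the matrix.
rescale_sketch <- function(kinship, min_kinship = min(kinship)) {
  (kinship - min_kinship) / (1 - min_kinship)
}

# Toy 2-by-2 kinship matrix; values are invented for illustration only
kinship_sub <- matrix(c(0.55, 0.10,
                        0.10, 0.55), nrow = 2, byrow = TRUE)

# The old minimum (0.10) maps to zero; other values shrink toward it
kinship_rescaled <- rescale_sketch(kinship_sub)
kinship_rescaled
```

Note that the formula maps `min_kinship` to zero and leaves a value of 1 unchanged, so the rescaled values stay on the same zero-to-one scale.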

The rescaled `n`-by-`n` kinship matrix, with the desired level of relatedness set to zero.

    library(popkin)

    # Construct toy data
    X <- matrix(c(0,1,2,1,0,1,1,0,2), nrow=3, byrow=TRUE) # genotype matrix
    subpops <- c(1,1,2) # subpopulation assignments for individuals
    subpops2 <- 1:3 # alternate labels treat every individual as a different subpop

    # NOTE: for BED-formatted input, use BEDMatrix!
    # "file" is path to BED file (excluding .bed extension)
    ## library(BEDMatrix)
    ## X <- BEDMatrix(file) # load genotype matrix object

    # suppose we first estimate kinship without subpopulations, which will be more biased
    kinship <- popkin(X) # calculate kinship from genotypes, WITHOUT subpops
    # then we visualize this matrix, figure out a reasonable subpopulation partition

    # now we can adjust the kinship matrix!
    kinship2 <- rescale_popkin(kinship, subpops)
    # prev is faster but otherwise equivalent to re-estimating kinship from scratch with subpops:
    # kinship2 <- popkin(X, subpops)

    # can also manually set the level of relatedness min_kinship we want to be zero:
    min_kinship <- min(kinship) # a naive choice for example
    kinship2 <- rescale_popkin(kinship, min_kinship = min_kinship)

    # lastly, omitting both subpops and min_kinship sets the minimum value in kinship to zero
    kinship3 <- rescale_popkin(kinship2)
    # equivalent to both of:
    # kinship3 <- popkin(X)
    # kinship3 <- rescale_popkin(kinship2, min_kinship = min(kinship))

popkin documentation built on Jan. 7, 2023, 1:26 a.m.
