kinshipPairs: Extract pairs of individuals matching certain kinship...

View source: R/utils.R

kinshipPairs    R Documentation

Extract pairs of individuals matching certain kinship criteria

Description

The kinshipPairs function extracts pairs of individuals matching a user-defined kinship condition (e.g. individuals with a kinship larger than 0.0625). Such sets of related pairs, together with pairs of unrelated individuals, enable a familial resemblance analysis on quantitative traits (Ziegler 2010); see the examples below for details.

By default, kinshipPairs returns all pairs of individuals for which the condition on the kinship matrix matches (e.g. all pairs of individuals with a kinship coefficient larger than or equal to 0.25). Individuals can thus be reported multiple times (see examples below). The duplicates parameter defines a strategy to avoid such duplicated IDs. Supported values are:

  • duplicates = "keep": the default, return all values.

  • duplicates = "first": report only the first pair of individuals for each individual ID.

  • duplicates = "last": report only the last pair of individuals for each individual ID.

  • duplicates = "random": randomly select one pair of individuals for each individual ID.

For any setting other than duplicates = "keep", each individual is listed only once in the resulting matrix.
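The effect of the duplicates strategies can be illustrated with a minimal base R sketch on a toy pair matrix. The helper drop_duplicated_ids and the toy data are hypothetical, and the greedy rule used here (keep a pair only if neither ID was already used) is an assumption made for illustration; kinshipPairs may resolve duplicates differently.

```r
## Toy two-column matrix of ID pairs (hypothetical data, not from FamAgg).
pairs <- rbind(c("1", "2"),
               c("1", "3"),
               c("2", "3"),
               c("4", "5"))

## Hypothetical helper: walk through the pairs in an order given by the
## strategy and keep a pair only if neither of its IDs was already used.
## This is an assumed rule, not necessarily what kinshipPairs does.
drop_duplicated_ids <- function(pairs,
                                strategy = c("first", "last", "random")) {
    strategy <- match.arg(strategy)
    n <- nrow(pairs)
    ord <- switch(strategy,
                  first = seq_len(n),
                  last = rev(seq_len(n)),
                  random = sample.int(n))
    seen <- character()
    keep <- logical(n)
    for (i in ord) {
        if (!any(pairs[i, ] %in% seen)) {
            keep[i] <- TRUE
            seen <- c(seen, pairs[i, ])
        }
    }
    ## Return the kept pairs in their original row order.
    pairs[keep, , drop = FALSE]
}

first_pairs <- drop_duplicated_ids(pairs, "first")
last_pairs <- drop_duplicated_ids(pairs, "last")
random_pairs <- drop_duplicated_ids(pairs, "random")
```

Under this rule no ID appears more than once in any of the three results, at the cost of discarding pairs that share an individual with an already selected pair.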

Usage

kinshipPairs(
  x,
  condition = function(x) x >= 0.25,
  duplicates = c("keep", "first", "last", "random"),
  id = NULL,
  family = NULL
)

Arguments

x

A FAData object (or object inheriting from that).

condition

A function defining how pairs of individuals should be selected based on the object's kinship matrix. The default is to select all pairs with a kinship >= 0.25. Note that the diagonal of the kinship matrix (i.e. the kinship of an individual with itself) is always skipped, so no additional criterion is needed to avoid self-pairs.
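As a rough illustration of how such a condition function acts on a kinship matrix, the base R sketch below applies it to a toy 3 x 3 matrix. The matrix values and the use of the upper triangle to skip the diagonal and avoid reporting each pair twice are assumptions for illustration, not a description of the FamAgg internals.

```r
## Toy kinship matrix (hypothetical): A is a parent of B, B a parent of C.
ids <- c("A", "B", "C")
kin <- matrix(c(0.5,   0.25, 0.125,
                0.25,  0.5,  0.25,
                0.125, 0.25, 0.5),
              nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

## Same form as the default condition argument of kinshipPairs.
condition <- function(x) x >= 0.25

## Restrict to the upper triangle: this skips the diagonal (self-pairs)
## and counts each unordered pair only once (assumed approach).
sel <- condition(kin) & upper.tri(kin)
hits <- which(sel, arr.ind = TRUE)

## Translate matrix positions into a two-column matrix of IDs.
pairs <- cbind(rownames(kin)[hits[, "row"]],
               colnames(kin)[hits[, "col"]])
pairs
```

Any function that takes a numeric input and returns a logical of the same shape could be used in the same way, e.g. function(x) x < 0.03125 to pick distantly related pairs.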

duplicates

character(1) defining how to deal with duplicated IDs in the result returned by the function. See function description and examples below for more details. Defaults to duplicates = "keep" returning all pairs of IDs matching condition.

id

optional identifiers of a subset of individuals for which the pairs should be defined. Defaults to id = NULL, in which case the full data set is considered.

family

optional family identifiers if pairs should only be defined for selected families. Defaults to family = NULL, in which case the full data set is considered.

Value

A two-column matrix with the IDs (colnames/rownames of the kinship matrix or as defined in x$id) of the pairs. If duplicates is "first", "last" or "random", each ID is reported at most once.
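A two-column ID matrix of this shape can be used directly to look up values in a vector named by individual ID. The sketch below uses hypothetical toy data and assumes the IDs are stored as character strings, so that indexing is by name rather than by position.

```r
## Hypothetical trait values, named by individual ID.
trait <- c("1" = 10.2, "2" = 11.0, "3" = 9.5, "4" = 12.1,
           "5" = NA,   "6" = 10.8, "7" = 8.9, "8" = 9.9)

## Hypothetical result in the documented shape: one pair of IDs per row.
pairs <- rbind(c("1", "2"),
               c("3", "4"),
               c("5", "6"),
               c("7", "8"))

## Character IDs index the named vector by name, giving aligned vectors
## of trait values for the first and second member of each pair.
first_member <- trait[pairs[, 1]]
second_member <- trait[pairs[, 2]]

## Pairs with a missing value are dropped from the correlation.
r <- cor(first_member, second_member, use = "pairwise.complete.obs")
```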

Author(s)

Johannes Rainer

References

Ziegler A., Koenig I. R. (2010). Familiality, Heritability, and Segregation Analysis. In A Statistical Approach to Genetic Epidemiology: With Access to E-Learning Platform by Friedrich Pahlke, Second Edition. doi: 10.1002/9783527633654.ch6.

See Also

PedigreeUtils for other pedigree utility functions.

Examples


##########################
##
##  Create a new FAData object
##
## Load the Minnesota Breast Cancer record and subset to the
## families with IDs 1 to 20.
library(FamAgg)
data(minnbreast)
mbsub <- minnbreast[minnbreast$famid %in% 1:20, ]
mbped <- mbsub[, c("famid", "id", "fatherid", "motherid", "sex")]
## Renaming column names
colnames(mbped) <- c("family", "id", "father", "mother", "sex")
## Defining the optional argument age.
Age <- mbsub$endage
names(Age) <- mbsub$id
## Create the object
fad <- FAData(pedigree=mbped, age=Age)

## Getting all pairs of individuals with a kinship coefficient >= 0.25
## keeping all duplicates
rel_pairs <- kinshipPairs(fad)
head(rel_pairs)
## As we can see, some individuals (e.g. individual 1) are listed
## multiple times.

## For an actual correlation analysis it would be better to drop duplicates.
## Below we randomly select one pair for each individual that occurs
## multiple times.
rel_pairs <- kinshipPairs(fad, duplicates = "random")
head(rel_pairs)

## In addition we extract pairs of individuals that are much less related.
## For this example we consider all pairs of individuals with a kinship
## coefficient < 0.03125 (e.g. second cousins or more distant relatives)
## to be *unrelated*.
unrel_pairs <- kinshipPairs(fad, duplicates = "random",
    condition = function(z) z < 0.03125)
head(unrel_pairs)

## For a familial resemblance analysis we can now calculate the correlation
## coefficient of a quantitative trait between pairs of related individuals
## and compare that with the correlation coefficient calculated on unrelated
## individuals. For our toy example we use the participant's age, since we
## don't have any other quantitative values available.
cor_rel <- cor(age(fad)[rel_pairs[, 1]], age(fad)[rel_pairs[, 2]],
    use = "pairwise.complete.obs")

cor_unrel <- cor(age(fad)[unrel_pairs[, 1]], age(fad)[unrel_pairs[, 2]],
    use = "pairwise.complete.obs")
cor_rel
cor_unrel

## We don't see a clear difference between the two correlations; thus age,
## as expected, has no familial component.

EuracBiomedicalResearch/FamAgg documentation built on March 12, 2023, 7:45 p.m.