kinship.pca: Performs a Principal Component Analysis (PCA) based on a...

View source: R/Kinship_pca.R

kinship.pca R Documentation

Performs a Principal Component Analysis (PCA) based on a kinship matrix K

Description

Generates a PCA and summary statistics from a given kinship matrix to assess population structure. This matrix can be a pedigree-based relationship matrix \boldsymbol{A}, a genomic relationship matrix \boldsymbol{G} or a hybrid relationship matrix \boldsymbol{H}. Individual names should be assigned to rownames and colnames. Additional output, such as plots and data frames, is provided for use in downstream analyses (such as GWAS).

Usage

kinship.pca(
  K = NULL,
  scale = TRUE,
  label = FALSE,
  ncp = 10,
  groups = NULL,
  ellipses = FALSE
)

Arguments

K

Input of a kinship matrix in full form (n \times n) (default = NULL).

scale

If TRUE, the PCA will scale the kinship matrix; otherwise the matrix is used in its original scale (default = TRUE).

label

If TRUE, the names of individuals are included in the output (default = FALSE).

ncp

The number of PC dimensions to show in the scree plot and to return in the output data frames (default = 10).

groups

A vector of class factor used to assign different colors to individuals in the PCA plot. Its elements must be in the same order as the individuals in the kinship matrix (default = NULL).

ellipses

If TRUE, ellipses will be drawn around each of the defined levels in groups (default = FALSE).
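
The requirements on K (full form, named rows and columns) and on groups (same order as the individuals in K) can be checked before calling the function. The following is a minimal base R sketch under stated assumptions: the 4 x 4 matrix, the data frame info and the helper check_kinship() are all hypothetical and are not part of ASRgenomics.

```r
# Toy symmetric kinship matrix with individual names (illustrative values only).
ids <- c("ind1", "ind2", "ind3", "ind4")
K <- matrix(c(1.00, 0.50, 0.25, 0.00,
              0.50, 1.00, 0.25, 0.00,
              0.25, 0.25, 1.00, 0.10,
              0.00, 0.00, 0.10, 1.00),
            nrow = 4, dimnames = list(ids, ids))

# Hypothetical helper: basic checks on a full-form kinship matrix.
check_kinship <- function(K) {
  stopifnot(is.matrix(K), nrow(K) == ncol(K))              # square, n x n
  stopifnot(!is.null(rownames(K)),
            identical(rownames(K), colnames(K)))           # names on both sides
  stopifnot(isSymmetric(unname(K)))                        # symmetric values
  invisible(TRUE)
}
check_kinship(K)

# Hypothetical family records, deliberately in a different order from K.
info <- data.frame(id     = c("ind3", "ind1", "ind4", "ind2"),
                   family = c("F2", "F1", "F2", "F1"))

# Reorder the grouping factor so it follows the row order of K.
grp <- factor(info$family[match(rownames(K), info$id)])
```

Here grp could then be supplied as groups, since its elements now follow the row order of K.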

Details

It calls the function eigen() to obtain the eigenvalues used to generate the PCA, and uses the factoextra R package to extract and visualize the results.
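
To illustrate the general idea of a PCA built on an eigen-decomposition, here is a minimal base R sketch on a toy matrix. It is not the package's implementation: the matrix values are made up, no scaling step is applied, and computing scores as eigenvectors multiplied by the square root of their eigenvalues is one common convention assumed here for illustration.

```r
# Toy positive-definite kinship matrix (illustrative values only).
ids <- c("ind1", "ind2", "ind3", "ind4")
K <- matrix(c(1.00, 0.50, 0.25, 0.00,
              0.50, 1.00, 0.25, 0.00,
              0.25, 0.25, 1.00, 0.10,
              0.00, 0.00, 0.10, 1.00),
            nrow = 4, dimnames = list(ids, ids))

# Eigen-decomposition of the symmetric matrix.
eig <- eigen(K, symmetric = TRUE)

# Percentage of variance attributed to each dimension.
pct_var <- 100 * eig$values / sum(eig$values)

# Scores for the first ncp dimensions (assumed convention:
# eigenvectors scaled column-wise by sqrt of their eigenvalues).
ncp <- 2
scores <- sweep(eig$vectors[, 1:ncp, drop = FALSE], 2,
                sqrt(pmax(eig$values[1:ncp], 0)), "*")
dimnames(scores) <- list(rownames(K), paste0("PC", 1:ncp))

round(pct_var, 1)
head(scores)
```

The percentages in pct_var correspond to what a scree plot displays, and the first two columns of scores are what a PC1 versus PC2 scatterplot would show.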

Value

A list with the following four elements:

  • eigenvalues: a data frame with the eigenvalues and their variances associated with each dimension, including only the first ncp dimensions.

  • pca.scores: a data frame with scores (rotated observations on the new components) including only the first ncp dimensions.

  • plot.pca: a scatterplot of the scores on the first two dimensions (PC1 and PC2).

  • plot.scree: a bar chart with the percentage of variance explained by each of the ncp dimensions.

Examples

# Get G matrix.
G <- G.matrix(M = geno.apple, method = "VanRaden")$G
G[1:5, 1:5]

# Perform the PCA.
G_pca <- kinship.pca(K = G, ncp = 10)
ls(G_pca)
G_pca$eigenvalues
head(G_pca$pca.scores)
G_pca$plot.pca
G_pca$plot.scree

# PCA plot by family (17 groups).
grp <- as.factor(pheno.apple$Family)
G_pca_grp <- kinship.pca(K = G, groups = grp, label = FALSE, ellipses = FALSE)
G_pca_grp$plot.pca


ASRgenomics documentation built on May 29, 2024, 12:03 p.m.