predict_gt_pca: Predict scores of a PCA

predict.gt_pca R Documentation

Predict scores of a PCA

Description

Predict the PCA scores for a gt_pca, either for the original data or projecting new data.

Usage

## S3 method for class 'gt_pca'
predict(
  object,
  new_data = NULL,
  project_method = c("none", "simple", "OADP", "least_squares"),
  lsq_pcs = c(1, 2),
  block_size = NULL,
  n_cores = 1,
  as_matrix = TRUE,
  ...
)

Arguments

object

the gt_pca object

new_data

a gen_tibble if scores are requested for a new dataset; if NULL (the default), scores are returned for the original data

project_method

a string, one of "none" (the default), "simple", "OADP" (Online Augmentation, Decomposition, and Procrustes projection), or "least_squares" (as done by SMARTPCA)

lsq_pcs

a vector of length two giving the two principal components to use for the least squares fitting. Only relevant if project_method = "least_squares"

block_size

number of loci read simultaneously (larger values will speed up computation, but require more memory)

n_cores

number of cores

as_matrix

logical, whether to return the result as a matrix (default) or a tibble.

...

not used

Value

a matrix of predictions (in line with predict using a prcomp object) or a tibble, with samples as rows and components as columns. The number of components depends on how many were estimated in the gt_pca object.
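
To illustrate the shape of the returned matrix, the sketch below uses base R's prcomp as a stand-in for a gt_pca object. It is not tidypopgen's implementation: the simulated genotypes and the object names (geno, reference, new_samples, ref_pca) are assumptions made up for the illustration. It also shows what a plain projection amounts to in the prcomp case, namely centring the new data with the reference means and multiplying by the reference loadings.

```r
# Base-R illustration only: prcomp() stands in for a gt_pca object here.
set.seed(1)

# Simulated allele counts (0/1/2): 30 individuals x 8 loci (made-up data).
geno <- matrix(rbinom(30 * 8, size = 2, prob = 0.4), nrow = 30)
colnames(geno) <- paste0("locus", 1:8)
reference <- geno[1:20, ]
new_samples <- geno[21:30, ]

# Fit the PCA on the reference individuals only.
ref_pca <- prcomp(reference, center = TRUE, scale. = FALSE)

# predict() returns a matrix: one row per new sample, one column per component.
new_scores <- predict(ref_pca, newdata = new_samples)
dim(new_scores)

# The same scores by hand: centre with the reference means, then
# multiply by the reference loadings.
manual_scores <- sweep(new_samples, 2, ref_pca$center) %*% ref_pca$rotation
all.equal(unname(new_scores), unname(manual_scores))
```

With a gt_pca object the call has the same pattern, except that the new samples are passed as new_data and the projection is chosen with project_method.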

References

Zhang et al. (2020). Fast and robust ancestry prediction using principal component analysis. Bioinformatics 36(11): 3439–3446.

Examples



# Create a gen_tibble of lobster genotypes
bed_file <-
  system.file("extdata", "lobster", "lobster.bed", package = "tidypopgen")
lobsters <- gen_tibble(bed_file,
  backingfile = tempfile("lobsters"),
  quiet = TRUE
)

# Remove monomorphic loci and impute
lobsters <- lobsters %>% select_loci_if(loci_maf(genotypes) > 0)
lobsters <- gt_impute_simple(lobsters, method = "mode")

# Subset into two datasets: one original and one to predict
original_lobsters <- lobsters[c(1:150), ]
new_lobsters <- lobsters[c(151:176), ]

# Create PCA object
pca <- gt_pca_partialSVD(original_lobsters)

# Predict
predict(pca, new_data = new_lobsters, project_method = "simple")

# Predict with OADP
predict(pca, new_data = new_lobsters, project_method = "OADP")

# Predict with least squares
predict(pca,
  new_data = new_lobsters,
  project_method = "least_squares", lsq_pcs = c(1, 2)
)

# Return a tibble
predict(pca, new_data = new_lobsters, as_matrix = FALSE)

# Adjust block_size
predict(pca, new_data = new_lobsters, block_size = 10)


tidypopgen documentation built on Aug. 28, 2025, 1:08 a.m.