impute_ppca: Impute missing values using Probabilistic PCA


impute_ppca R Documentation

Impute missing values using Probabilistic PCA

Description

One of several PCA-based imputation methods. It is a wrapper around pcaMethods::pca(method = "ppca"). For a detailed discussion, see vignette("pcaMethods") and vignette("missingValues", "pcaMethods") as well as the References section.

In the underlying function, pcaMethods::pca(method = "ppca"), the order of columns influences the outcome. Calling pcaMethods::pca(method = "ppca") on a matrix and calling metamorphr::impute_ppca() on a tidy tibble might therefore give different results, even though both contain the same data. This is because the tibble is converted to a matrix before pcaMethods::pca(method = "ppca") is called, and you have limited influence on the column order of the resulting matrix.
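
The effect of column order can be checked directly with pcaMethods. The following is a minimal sketch, not part of metamorphr: the toy matrix m, its missing-value pattern and the permutation perm are made up for illustration, and whether the two results differ (and by how much) depends on the data.

```r
library(pcaMethods)

# Hypothetical toy data: 10 samples (rows) x 6 features (columns)
set.seed(123)
m <- matrix(rnorm(60, mean = 10), nrow = 10)
m[cbind(c(2, 5, 7, 9), c(1, 3, 4, 6))] <- NA  # introduce a few missing values

# Same data, columns in a different order
perm <- c(4, 1, 6, 2, 5, 3)
m_perm <- m[, perm]

# Run PPCA on both matrices with identical settings
imp <- completeObs(pca(m, method = "ppca", nPcs = 2, seed = 1))
imp_perm <- completeObs(pca(m_perm, method = "ppca", nPcs = 2, seed = 1))

# Undo the permutation, then report the largest absolute difference
max(abs(imp - imp_perm[, order(perm)]))
```

A value of zero would mean the column order made no difference for this particular matrix; any other value shows how far the two imputations diverge.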

Important Note

impute_ppca() depends on the pcaMethods package from Bioconductor. If metamorphr was installed via install.packages(), dependencies from Bioconductor were not installed automatically. When impute_ppca() is called without pcaMethods installed, you should be asked whether you want to install pak and pcaMethods. Both must be installed to use impute_ppca(). If the automatic installation fails, please install pcaMethods manually. See "pcaMethods – a Bioconductor package providing PCA methods for incomplete data" for instructions on manual installation.
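
For a manual installation, the usual Bioconductor route goes through the BiocManager package. The snippet below is a sketch of that general route, not of the pak-based installation that impute_ppca() offers; the Bioconductor page for pcaMethods remains the authoritative source.

```r
# Install BiocManager from CRAN if it is not available yet
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}

# Install pcaMethods from Bioconductor
BiocManager::install("pcaMethods")

# Check that the package can now be found (TRUE if the installation worked)
requireNamespace("pcaMethods", quietly = TRUE)
```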

Usage

impute_ppca(
  data,
  n_pcs = 2,
  center = TRUE,
  scale = "none",
  direction = 2,
  random_seed = 1L
)

Arguments

data

A tidy tibble created by read_featuretable.

n_pcs

The number of PCs to calculate.

center

Should data be mean centered? See prep for details.

scale

Should data be scaled? See prep for details.

direction

Either 1 or 2. With 1, the PCA is run on a matrix with samples in columns and features in rows. With 2, the PCA is run on a matrix with features in columns and samples in rows. Both are valid according to a discussion on GitHub, but they give different results.
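
The two layouts are transposes of each other. The base R sketch below only illustrates the orientations described above with a made-up matrix; it does not reproduce how metamorphr converts the tibble internally.

```r
# Hypothetical toy data: 3 samples, 4 features
samples_in_rows <- matrix(
  1:12,
  nrow = 3,
  dimnames = list(paste0("sample", 1:3), paste0("feature", 1:4))
)

# Layout described for direction = 2: features in columns, samples in rows
dim(samples_in_rows)

# Layout described for direction = 1: samples in columns, features in rows
samples_in_cols <- t(samples_in_rows)
dim(samples_in_cols)
```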

random_seed

An integer used as seed for the random number generator.

Value

A tibble with imputed missing values.

References

  • W. Stacklies, H. Redestig, 2017, DOI 10.18129/B9.BIOC.PCAMETHODS.

  • W. Stacklies, H. Redestig, M. Scholz, D. Walther, J. Selbig, Bioinformatics 2007, 23, 1164–1167, DOI 10.1093/bioinformatics/btm069.

Examples

# Impute missing values in the toy data set using the default settings
toy_metaboscape %>%
  impute_ppca()
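
The arguments listed under Usage can also be set explicitly. The call below is a sketch that assumes metamorphr and pcaMethods are installed; the chosen values (direction = 1, random_seed = 42L) are arbitrary and only meant to show the syntax, and the variable name imputed is hypothetical.

```r
library(metamorphr)

# Same toy data, but with samples in columns for the PCA and a different seed
imputed <- toy_metaboscape %>%
  impute_ppca(n_pcs = 2, direction = 1, random_seed = 42L)

imputed
```

Because direction changes the matrix orientation, the imputed values are expected to differ from those of the default call.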

metamorphr documentation built on June 10, 2026, 5:07 p.m.