skIncrPPCA: optionally fault tolerant incremental partial PCA for...

Description

Optionally fault-tolerant incremental partial PCA for projecting the samples of a SummarizedExperiment.

Usage

skIncrPPCA(
  se,
  chunksize,
  n_components,
  assayind = 1,
  picklePath = "./skIdump.pkl",
  matTx = force,
  ...
)

Arguments

se

instance of SummarizedExperiment

chunksize

integer number of samples per step

n_components

integer number of PCs to compute

assayind

not used; assumed to be 1

picklePath

if non-NULL, incremental results are saved to this path via sklearn.externals.joblib.dump after each chunk. If NULL, incremental results are not saved.

matTx

a function defaulting to force() that accepts a matrix and returns a matrix with identical dimensions, e.g., function(x) log(x+1)

...

not used
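
As a rough illustration of the matTx contract described above, the sketch below defines a log transform and checks that it preserves matrix dimensions. The names log1pTx and toy are hypothetical and the matrix holds made-up values; only base R is used.

```r
# Hypothetical transform for the matTx argument: log(x + 1), elementwise.
log1pTx <- function(x) log(x + 1)

# Toy count matrix (made-up values): 3 features (rows) x 4 samples (columns).
toy <- matrix(c(0, 1, 3, 7, 0, 2, 5, 1, 4, 0, 9, 6), nrow = 3)

# A valid matTx must return a matrix with the same dimensions as its input.
out <- log1pTx(toy)
stopifnot(is.matrix(out), identical(dim(out), dim(toy)))

# force(), the default, returns its argument unchanged.
stopifnot(identical(force(toy), toy))
```

A function of this shape could then be supplied as matTx = log1pTx in the call.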

Value

python instance of sklearn.decomposition.incremental_pca.IncrementalPCA

Note

Samples are treated as records and all features (rows) as attributes, and are projected to an n_components-dimensional space. The method acquires each chunk of assay data and transposes it before computing PCA contributions. In case of a crash, restore from picklePath using SklearnEls()$joblib$load after loading reticulate. The n_samples_seen_ component of the restored Python reference indicates where to restart. Resumption can be managed with skPartialPCA_step.
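
The chunking and restart bookkeeping in this Note can be sketched in base R. This is only an illustration, assuming chunks are consecutive blocks of chunksize columns taken in order and that a crash leaves n_samples_seen_ at a chunk boundary; the actual internals of skIncrPPCA may differ. The helper chunkRanges and the variable names are hypothetical, and seen stands in for the value of n_samples_seen_ read from a restored object.

```r
# Hypothetical helper: start/end column indices for consecutive chunks.
chunkRanges <- function(nsamp, chunksize) {
  starts <- seq(1, nsamp, by = chunksize)
  ends <- pmin(starts + chunksize - 1, nsamp)
  data.frame(start = starts, end = ends)
}

# Toy assay (made-up values): 6 features (rows) x 12 samples (columns).
toy <- matrix(seq_len(72), nrow = 6)
rng <- chunkRanges(ncol(toy), chunksize = 5)

# Each chunk is transposed so that samples become rows (records).
first <- t(toy[, rng$start[1]:rng$end[1], drop = FALSE])
stopifnot(identical(dim(first), c(5L, 6L)))

# Assumption: `seen` plays the role of n_samples_seen_ after a crash.
# Processing would resume with the first chunk whose samples are all unseen.
seen <- 10
todo <- rng[rng$start > seen, ]
```

Here todo holds the column ranges still to be processed; each remaining chunk would be transposed in the same way before being passed on.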

Examples

## Not run: 
# demo SE made with TENxGenomics:
# mm = matrixSummarizedExperiment(h5path, 1:27998, 1:750)
# saveHDF5SummarizedExperiment(mm, "tenx_750")
#
if (requireNamespace("HDF5Array")) {
  se750 = HDF5Array::loadHDF5SummarizedExperiment(
     system.file("hdf5/tenx_750", package="BiocSklearn"))
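  lit = skIncrPPCA(se750[, 1:50], chunksize=5, n_components=4)
  # samples x features matrix for the same 50 samples
  dat = t(as.matrix(SummarizedExperiment::assay(se750[, 1:50])))
  pypc = lit$transform(dat)
  # correlations among the incremental PCs
  round(cor(pypc), 3)
  # compare with the leading 4 PCs from prcomp
  rpc = prcomp(dat)
  round(cor(rpc$x[,1:4], pypc), 3)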
}

## End(Not run) # this has to be made basilisk-compliant

BiocSklearn documentation built on Nov. 8, 2020, 7:52 p.m.