skIncrPCA_h5: demo of HDF5 processing with incremental...


View source: R/skIPart.R

Description

Demonstrates incremental PCA on a dataset stored in an HDF5 file, using IncrementalPCA with a batch_size and its fit_transform method.

Usage

skIncrPCA_h5(fn, dsname = "assay001", n_components, chunk.size = 10L)

Arguments

fn

character(1) path to HDF5 file

dsname

character(1) name of the dataset within the HDF5 file, assumed to be a 2-dimensional array

n_components

numeric(1) passed to IncrementalPCA as n_components

chunk.size

numeric(1) passed to IncrementalPCA as batch_size

Note

Here we use IncrementalPCA$fit_transform and let Python take care of chunk retrieval. By contrast, skIncrPartialPCA acquires chunks from an R matrix and uses IncrementalPCA$partial_fit.
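
The two routes described in this note can be sketched directly with reticulate. This is an illustrative sketch, not the package's implementation: it assumes reticulate can find a Python installation with scikit-learn and numpy, and it uses the in-memory iris measurements in place of an HDF5 dataset. The names skd, chunk_size, proj_fit and proj_part are made up for this example.

```r
# Sketch only: assumes a Python with scikit-learn is visible to reticulate.
library(reticulate)

skd <- import("sklearn.decomposition")

X <- unname(data.matrix(iris[, 1:4]))  # 150 x 4 numeric matrix
chunk_size <- 10L

# Route 1: fit_transform. Python splits X into batches of batch_size rows.
ipca_fit <- skd$IncrementalPCA(n_components = 2L, batch_size = chunk_size)
proj_fit <- ipca_fit$fit_transform(X)

# Route 2: partial_fit. R slices the rows and hands over one chunk at a time.
ipca_part <- skd$IncrementalPCA(n_components = 2L)
for (s in seq(1L, nrow(X), by = chunk_size)) {
  rows <- s:min(s + chunk_size - 1L, nrow(X))
  ipca_part$partial_fit(X[rows, , drop = FALSE])
}
proj_part <- ipca_part$transform(X)

dim(proj_fit)
dim(proj_part)
```

In the first route only the batch size crosses the R/Python boundary; in the second, R decides which rows form each chunk, which is useful when the data cannot be handed to Python in one piece.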

Examples

if (interactive()) {
 fn = system.file("hdf5/irmatt.h5", package="BiocSklearn") # 'transposed' relative to R iris
 dem = skIncrPCA_h5(fn, n_components=3L, dsname="tquants")
 dem
 head(getTransformed(dem))
}

Example output

Loading required package: reticulate
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: 'MatrixGenerics'

The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars

Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.


Attaching package: 'Biobase'

The following object is masked from 'package:MatrixGenerics':

    rowMedians

The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians

Loading required package: knitr

BiocSklearn documentation built on Nov. 8, 2020, 7:52 p.m.