generateSubsampledMatrix: Generate the read matrix corresponding to a particular level


View source: R/generateSubsampledMatrix.R

Description

Generate a subsampled matrix from an original count matrix. This can be used to perform read subsampling analyses, though the subsample function is generally recommended for that purpose.

It is also useful for reproducing the results of an earlier run (see Details).

Usage

generateSubsampledMatrix(counts, proportion, seed, replication = 1)

Arguments

counts

Original matrix of read counts

proportion

The proportion of reads to subsample (for example, 0.1)

seed

A subsampling seed, which can be extracted from a subsamples or summary.subsamples object. If not given, the random seed is not set.

replication

Replicate number, which allows multiple deterministic replications to be performed at a given subsampling proportion

Details

A subsamples or summary.subsamples object does not contain the subsampled count matrix at each depth, as storing these would take too much space and they are rarely needed. However, because the object saves the random seed used to generate the count matrices, the matrix at any depth can be regenerated. For a subsamples object ss, retrieve the seed with getSeed(ss). Passing that seed to generateSubsampledMatrix along with the original counts, the proportion, and the replication number (if more than one subsampling was done at each proportion) produces the same matrix that was used in the analysis.

The seed is calculated deterministically using an md5 hash of three combined values: the global seed used for the subsampling object, the subsampling proportion, and the replication number for that proportion.
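The idea of deriving a per-matrix seed from these three values can be illustrated with a minimal base R sketch. It is not the package's implementation: the helpers derive_seed and subsample_sketch are hypothetical, the way the three values are combined before hashing is an assumption, and binomial thinning of each count is assumed as the subsampling step. Its results will therefore not match those of generateSubsampledMatrix.

```r
# Hypothetical helper: turn (global seed, proportion, replication) into an
# integer seed via an md5 hash. tools::md5sum() hashes files, so the combined
# string is written to a temporary file first.
derive_seed <- function(global_seed, proportion, replication = 1) {
  f <- tempfile()
  on.exit(unlink(f))
  writeLines(paste(global_seed, proportion, replication, sep = "_"), f)
  h <- unname(tools::md5sum(f))
  # Keep 7 hex digits so the value fits in a 32-bit signed integer.
  strtoi(substr(h, 1, 7), base = 16L)
}

# Hypothetical helper: binomial thinning of every count (an assumption about
# the subsampling step, used here only for illustration).
subsample_sketch <- function(counts, proportion, seed, replication = 1) {
  if (!missing(seed)) set.seed(derive_seed(seed, proportion, replication))
  matrix(rbinom(length(counts), size = counts, prob = proportion),
         nrow = nrow(counts), dimnames = dimnames(counts))
}

# Toy count matrix: 3 genes by 2 samples.
counts <- matrix(c(120, 30, 0, 75, 200, 10), nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("A", "B")))

m1 <- subsample_sketch(counts, 0.1, seed = 1234)
m2 <- subsample_sketch(counts, 0.1, seed = 1234)
identical(m1, m2)  # TRUE: both calls set the same derived seed
```

Because the derived seed depends on the proportion and the replication number as well as the global seed, each combination gets its own reproducible random stream without any subsampled matrix having to be stored.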

Value

The subsampled count matrix at the specified subsampling proportion

Examples

library(subSeq)
data(hammer)

hammer.counts = Biobase::exprs(hammer)[, 1:4]
hammer.design = Biobase::pData(hammer)[1:4, ]
hammer.counts = hammer.counts[rowSums(hammer.counts) >= 5, ]

ss = subsample(hammer.counts, c(.01, .1, 1), treatment = hammer.design$protocol,
               method = c("edgeR", "DESeq2", "voomLimma"))

seed = getSeed(ss)

# regenerate the matrices used at the 0.01 and 0.1 subsampling proportions
subm.01 = generateSubsampledMatrix(hammer.counts, .01, seed)
subm.1 = generateSubsampledMatrix(hammer.counts, .1, seed)

subSeq documentation built on Nov. 8, 2020, 5:45 p.m.