computeSpikeFactors: Normalization with spike-in counts

Description

Compute size factors based on the coverage of spike-in transcripts.

Usage

## S4 method for signature 'SCESet'
computeSpikeFactors(x, type=NULL, sf.out=FALSE, general.use=TRUE)

Arguments

x

An SCESet object containing rows corresponding to spike-in transcripts.

type

A character vector specifying which spike-in sets to use.

sf.out

A logical scalar indicating whether only size factors should be returned.

general.use

A logical scalar indicating whether the size factors should be stored for general use by all genes.

Details

The size factor for each cell is defined as the sum of all spike-in counts in that cell. Normalizing with these size factors equalizes spike-in coverage between cells. The size factors are scaled so that their mean across all cells is unity, which standardizes them for comparison with other sets of size factors.
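The calculation described above can be sketched in base R. This is only an illustration of the arithmetic, not the function's actual implementation; the matrix spike.counts and its values are made up for the example.

```r
# Hypothetical toy counts: 3 spike-in transcripts (rows) by 4 cells (columns).
spike.counts <- matrix(c(10, 20, 30,
                         20, 40, 60,
                          5, 10, 15,
                          5, 10, 15), nrow=3,
                       dimnames=list(paste0("spike", 1:3), paste0("cell", 1:4)))

# Total spike-in coverage in each cell.
totals <- colSums(spike.counts)

# Scale so that the mean size factor across cells is unity.
sf <- totals / mean(totals)

# Dividing each cell's counts by its size factor equalizes spike-in coverage.
normalized <- t(t(spike.counts) / sf)
colSums(normalized)
```

After scaling, every cell in this toy matrix has the same total spike-in coverage, which is the sense in which the normalization "equalizes" coverage.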

Spike-in counts are assumed to be stored in the rows specified by isSpike(x). These rows should have been specified beforehand by supplying the names of the spike-in sets; see ?setSpike for more details. By default, if multiple spike-in sets are available, all of them will be used to compute the size factors. The function can be restricted to a subset of the spike-ins by specifying the names of the desired spike-in sets in type.
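The effect of restricting the calculation to particular spike-in sets can be sketched in base R. The helper spikeFactorsFor, the label vector spike.set and the toy counts are all hypothetical and exist only for this illustration; the real function obtains the row selection from isSpike(x) rather than from a label vector.

```r
# Hypothetical toy data: 4 spike-in rows from two sets, 3 cells.
# Values are filled column by column, so each line below is one cell.
spike.counts <- matrix(c(10, 10, 5, 5,
                         20, 20, 1, 1,
                         30, 30, 3, 3), nrow=4,
                       dimnames=list(NULL, paste0("cell", 1:3)))
spike.set <- c("ERCC", "ERCC", "SIRV", "SIRV")  # hypothetical set label per row

# Hypothetical helper, not part of scran.
spikeFactorsFor <- function(counts, sets, type=NULL) {
    # type=NULL mimics the default of using every available spike-in set.
    keep <- if (is.null(type)) rep(TRUE, length(sets)) else sets %in% type
    totals <- colSums(counts[keep, , drop=FALSE])
    totals / mean(totals)
}

sf.all <- spikeFactorsFor(spike.counts, spike.set)
sf.ercc <- spikeFactorsFor(spike.counts, spike.set, type="ERCC")
```

In this sketch the two vectors differ because the rows labelled "SIRV" contribute to sf.all but not to sf.ercc.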

By default, the function will store several copies of the same size factors in the output object. One copy will be stored in sizeFactors(x) for normalization of all genes – this can be disabled by setting general.use=FALSE. One copy will also be stored in sizeFactors(x, type=s), where s is the name of a spike-in set in type. (If type=NULL, a copy is stored for every spike-in set, as all of them would be used to compute the size factors.) Separate storage allows spike-in-specific normalization in normalize,SCESet-method.

Value

If sf.out=TRUE, a numeric vector of size factors is returned directly.

Otherwise, an object of the same class as x is returned, containing size factors for all cells. A copy of the vector is stored for each spike-in set that was used to compute the size factors. If general.use=TRUE, a copy is also stored for use by non-spike-in genes.

Author(s)

Aaron Lun

References

Lun ATL, McCarthy DJ and Marioni JC (2016). A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res. 5:2122

See Also

isSpike, setSpike

Examples

################
# Mocking up some data.
set.seed(100)
ncells <- 200

nspikes <- 100
spike.means <- 2^runif(nspikes, 3, 8)
spike.disp <- 100/spike.means + 0.5
spike.data <- matrix(rnbinom(nspikes*ncells, mu=spike.means, size=1/spike.disp), ncol=ncells)

ngenes <- 2000
cell.means <- 2^runif(ngenes, 2, 10)
cell.disp <- 100/cell.means + 0.5
cell.data <- matrix(rnbinom(ngenes*ncells, mu=cell.means, size=1/cell.disp), ncol=ncells)

combined <- rbind(cell.data, spike.data)
colnames(combined) <- seq_len(ncells)
rownames(combined) <- seq_len(nrow(combined))
y <- newSCESet(countData=combined)
y <- calculateQCMetrics(y, list(Spike=rep(c(FALSE, TRUE), c(ngenes, nspikes))))
setSpike(y) <- "Spike"

################
# Computing and storing spike-in size factors. 
y2 <- computeSpikeFactors(y)
head(sizeFactors(y2))
head(sizeFactors(y2, type="Spike"))

# general.use=FALSE does not modify general size factors
sizeFactors(y2) <- 1
sizeFactors(y2, type="Spike") <- 1
y2 <- computeSpikeFactors(y2, general.use=FALSE)
head(sizeFactors(y2))
head(sizeFactors(y2, type="Spike"))

scran documentation built on May 31, 2017, 2:28 p.m.
