Performs recursive quantilizations on gene expression data across samples to discretize the gene expression matrix. The quantile parameter q determines the estimated proportion of differentially expressed genes (2q in total, covering both up- and down-regulation). The rank parameter r determines how many discrete levels differentially expressed genes (or outliers) should have. See details below.
x |
It can be an object of the |
... |
Currently, the ... accepts two parameters: |
q: Estimated proportion of conditions in which a gene is up- or down-regulated, a value in the open interval (0, 0.5); the default is 0.06. By specifying q, one estimates that in 2q of all conditions the expression value of a gene is considered an outlier.
rank: Ranks (levels) of outliers, a positive integer; the default is 1L. By default, each condition gets one label per gene from {-1, 0, 1}, representing down-regulation, no change and up-regulation, respectively. If rank>1, the outliers are further divided into rank levels by applying recursive quantilization with equal intervals.
Parameter q corresponds to the command line option -q in the QUBIC command line tool, and the rank option corresponds to -r.
For each gene, the algorithm first applies quantile discretization to divide conditions into negative (lower), unchanged and positive (higher) expression. Conditions with negative or positive expression are considered outliers. If rank>1, the algorithm then further discretizes the expression values of the outliers in each direction.
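As a rough illustration of this first step for a single gene, the sketch below uses plain sample quantiles from stats::quantile. The helper name sketchFirstStep is hypothetical and not part of the package, and the actual implementation may choose its thresholds and treat ties differently.

```r
## Hypothetical helper (an assumption for illustration, not the package's code):
## label the conditions of one gene as -1 (lower outlier), 0 (unchanged)
## or 1 (upper outlier) using the q and 1-q sample quantiles.
sketchFirstStep <- function(x, q = 0.06) {
  stopifnot(is.numeric(x), q > 0, q < 0.5)
  lower <- quantile(x, probs = q, names = FALSE)
  upper <- quantile(x, probs = 1 - q, names = FALSE)
  labels <- integer(length(x))   # 0L everywhere: unchanged
  labels[x <= lower] <- -1L      # lower outliers
  labels[x >= upper] <- 1L       # upper outliers
  labels
}

## Expression values of one gene across ten conditions
x <- c(5, 1, 9, 4, 6, 5.5, 4.5, 5.2, 4.8, 5.1)
res <- sketchFirstStep(x, q = 0.1)
## Here only the value 1 is labelled -1 and only the value 9 is labelled 1.
print(res)
```

With q = 0.1 and ten conditions, roughly one condition per direction ends up as an outlier, which matches the reading of 2q as the total outlier proportion.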
This second discretization step is performed by dividing the sorted outliers into rank tandem groups, each containing an equal number of conditions. A label is assigned to each of these tandem groups in the following order:
-1, -2, …, -rank
for outliers with negative expression, from the most negative group to the least negative group (not the other way around!).
Similarly, for positive outliers, labels in the order of
rank, rank-1, …, 1
are assigned to tandem groups from the least positive group to the most positive group.
That is, the sign of a label indicates the direction of gene expression change, and its absolute value represents the discretized rank among the outliers.
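The labelling rule above can be sketched for the outliers of one direction as follows. The helper sketchRankLabels and its negative argument are hypothetical names introduced here for illustration; the code follows the description literally (absolute value 1 marks the most extreme group) and is not taken from the package.

```r
## Hypothetical helper (an assumption for illustration, not the package's code):
## split the outliers of one direction into `rank` tandem groups of
## (near-)equal size and label them so that |label| == 1 is the most
## extreme group and |label| == rank is the least extreme group.
sketchRankLabels <- function(outliers, rank = 1L, negative = FALSE) {
  n <- length(outliers)
  ## ordering from the most extreme to the least extreme outlier
  ord <- if (negative) order(outliers) else order(-outliers)
  ## position of every outlier within that ordering (1 = most extreme)
  pos <- integer(n)
  pos[ord] <- seq_len(n)
  ## consecutive groups numbered 1..rank
  grp <- as.integer(ceiling(pos * rank / n))
  if (negative) -grp else grp
}

neg <- sketchRankLabels(c(-9, -3, -5, -7), rank = 2, negative = TRUE)
pos <- sketchRankLabels(c(3, 9, 5, 7), rank = 2)
print(neg)  # -9 and -7 get -1; -5 and -3 get -2
print(pos)  # 9 and 7 get 1; 5 and 3 get 2
```

With rank = 1 every outlier keeps the label -1 or 1, so the sketch reduces to the default three-level discretization.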
An object of the same class as the input parameter, with the exprs slot replaced by the discretized matrix, which is a matrix of integers.
Note that the discrete matrix produced by this implementation can be slightly different from the one used by the QUBIC command line tool.
The main reason for this is the internal data type: while QUBIC uses float to represent the expression matrix, we use double.
This has the advantages of interfacing with R, offering higher precision and avoiding errors caused by the floating-point representation. It comes at a potentially larger memory cost; however, for test data sets (for example the ALL dataset with more than 120 samples and 12000 genes) the peak memory use (<100 MB) as well as the execution time (CPU time 0.028s) are well under control.
The difference is most often observed when there are many tied values. Such cases, however, are rare, and we assume they will not affect the results to a large extent.
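To see why the data type can matter for ties, the sketch below rounds double-precision values to single precision with base R's writeBin and readBin. The helper toSingle is a hypothetical name used only for this illustration; it does not reproduce how QUBIC stores its data.

```r
## Hypothetical helper (an assumption for illustration): round doubles to
## single precision (4 bytes) and read them back as doubles.
toSingle <- function(x) {
  readBin(writeBin(x, raw(), size = 4L), what = "numeric",
          n = length(x), size = 4L)
}

a <- 1
b <- 1 + 1e-9                        # distinct from a in double precision
tiedInDouble <- (a == b)             # FALSE
tiedInSingle <- (toSingle(a) == toSingle(b))  # TRUE: the values collapse
print(c(double = tiedInDouble, single = tiedInSingle))
```

Two values that are distinct as double can become a tie as float, so quantile cut-offs computed on the two representations may assign a condition to different groups.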
Jitao David Zhang <jitao_david.zhang@roche.com>
Li et al. (2009) QUBIC: a qualitative biclustering algorithm for analyses of gene expression data. Nucleic Acids Research 37:e101.
parseQubicChars parses the matrix discretized by the QUBIC command line tool into a data frame.
library(Biobase)
data(sample.ExpressionSet, package="Biobase")
sample.disc <- quantileDiscretize(sample.ExpressionSet)
exprs(sample.disc)[1:6, 1:6]
## Equivalent to passing a numeric matrix
sample.mat.disc <- quantileDiscretize(exprs(sample.ExpressionSet))
sample.mat.disc[1:6, 1:6]
## Not run: identical(exprs(sample.disc),sample.mat.disc)
## with multiple ranks
sample.rank3 <- quantileDiscretize(sample.ExpressionSet, rank=3)
exprs(sample.rank3)[1:6, 1:6]
                A  B C D  E F
AFFX-MurIL2_at  1  0 1 1  0 0
AFFX-MurIL10_at 0  1 0 0 -1 0
AFFX-MurIL4_at  1 -1 1 0 -1 0
AFFX-MurFAS_at  1 -1 0 0  1 0
AFFX-BioB-5_at  1  0 0 1  0 0
AFFX-BioB-M_at  1 -1 0 0  0 0
                A  B C D  E F
AFFX-MurIL2_at  1  0 1 1  0 0
AFFX-MurIL10_at 0  1 0 0 -1 0
AFFX-MurIL4_at  1 -1 1 0 -1 0
AFFX-MurFAS_at  1 -1 0 0  1 0
AFFX-BioB-5_at  1  0 0 1  0 0
AFFX-BioB-M_at  1 -1 0 0  0 0
                A  B C D  E F
AFFX-MurIL2_at  1  0 1 3  0 0
AFFX-MurIL10_at 0  3 0 0 -3 0
AFFX-MurIL4_at  1 -3 2 0 -1 0
AFFX-MurFAS_at  3 -3 0 0  1 0
AFFX-BioB-5_at  1  0 0 3  0 0
AFFX-BioB-M_at  3 -3 0 0  0 0