plotAberrantPerSample | R Documentation |
The OUTRIDER package provides mutliple functions to visualize the data and the results of a full data set analysis.
This is the list of all plotting function provided by OUTRIDER:
plotAberrantPerSample()
plotVolcano()
plotExpressionRank()
plotQQ()
plotExpectedVsObservedCounts()
plotCountCorHeatmap()
plotCountGeneSampleHeatmap()
plotSizeFactors()
plotFPKM()
plotExpressedGenes()
plotDispEsts()
plotPowerAnalysis()
plotEncDimSearch()
For a detailed description of each plot function please see the details. Most of the functions share the same parameters.
plotAberrantPerSample(object, ...)
plotCountCorHeatmap(object, ...)
plotManhattan(object, ...)
plotEncDimSearch(object, ...)
plotQQ(object, ...)
plotVolcano(object, ...)
## S4 method for signature 'OutriderDataSet'
plotVolcano(
object,
sampleID,
main,
padjCutoff = 0.05,
zScoreCutoff = 0,
label = "aberrant",
xaxis = c("zscore", "log2fc", "fc"),
pch = 16,
basePlot = FALSE,
col = c("gray", "firebrick"),
subsetName = NULL
)
## S4 method for signature 'OutriderDataSet'
plotQQ(
object,
geneID,
main,
global = FALSE,
padjCutoff = 0.05,
zScoreCutoff = 0,
samplePoints = TRUE,
legendPos = "topleft",
outlierRatio = 0.001,
conf.alpha = 0.05,
subsetName = NULL,
pch = 16,
xlim = NULL,
ylim = NULL,
col = NULL
)
plotExpectedVsObservedCounts(
ods,
geneID,
main,
basePlot = FALSE,
log = TRUE,
groups = c(),
groupColSet = "Set1",
label = "aberrant",
subsetName = NULL,
...
)
plotExpressionRank(
ods,
geneID,
main,
padjCutoff = 0.05,
zScoreCutoff = 0,
normalized = TRUE,
basePlot = FALSE,
log = TRUE,
col = c("gray", "firebrick"),
groups = c(),
groupColSet = "Accent",
label = "aberrant",
subsetName = NULL
)
## S4 method for signature 'OutriderDataSet'
plotCountCorHeatmap(
object,
normalized = TRUE,
rowCentered = TRUE,
rowGroups = NA,
rowColSet = NA,
colGroups = NA,
colColSet = NA,
nRowCluster = 4,
nColCluster = 4,
main = "Count correlation heatmap",
basePlot = TRUE,
nBreaks = 50,
show_names = c("none", "row", "col", "both"),
...
)
plotCountGeneSampleHeatmap(
ods,
normalized = TRUE,
rowCentered = TRUE,
rowGroups = NA,
rowColSet = NA,
colGroups = NA,
colColSet = NA,
nRowCluster = 4,
nColCluster = 4,
main = "Count Gene vs Sample Heatmap",
bcvQuantile = 0.9,
show_names = c("none", "col", "row", "both"),
nGenes = 500,
nBreaks = 50,
...
)
## S4 method for signature 'OutriderDataSet'
plotAberrantPerSample(
object,
main = "Aberrant Genes per Sample",
outlierRatio = 0.001,
col = "Dark2",
yadjust = 1.2,
ylab = "Aberrantly expressed genes",
subsetName = NULL,
...
)
plotFPKM(ods, bins = 100)
## S4 method for signature 'OutriderDataSet'
plotDispEsts(
object,
compareDisp,
xlim,
ylim,
main = "Dispersion estimates versus mean expression",
...
)
plotPowerAnalysis(ods)
## S4 method for signature 'OutriderDataSet'
plotEncDimSearch(object)
plotExpressedGenes(ods, main = "Statistics of expressed genes")
plotSizeFactors(ods, basePlot = TRUE)
## S4 method for signature 'OutriderDataSet'
plotManhattan(
object,
sampleID,
value = "pvalue",
chr = NULL,
main = paste0("Sample: ", sampleID),
featureRanges = rowRanges(object),
subsetName = NULL,
chrColor = c("black", "darkgrey")
)
... |
Additional parameters passed to plot() or plot_ly() if not stated otherwise in the details for each plot function |
sampleID , geneID |
A sample or gene ID, which should be plotted. Can also be a vector. Integers are treated as indices. |
main |
Title for the plot, if missing a default title will be used. |
padjCutoff , zScoreCutoff |
Significance or Z-score cutoff to mark outliers |
label |
Indicates which genes or samples should be labeled. By default all aberrant genes/samples are labelled. Can be set to NULL for no labels. Provide a vector of geneIDs/sampleIDs to label specific genes/samples. |
xaxis |
Indicates which assay should be shown on the x-axis of the volcano plot. Defaults to 'zscore'. Other options are 'fc' and 'log2fc' for the fold-change or log2 fold-change. |
pch |
Integer or character to be used for plotting the points |
basePlot |
if |
col |
Set color for the points. If set, it must be a character vector
of length 2. (1. normal point; 2. outlier point) or a single
character referring to a color palette from |
subsetName |
The name of a subset of genes of interest for which FDR corrected pvalues were previously computed. Those FDR values on the subset will then be used to determine aberrant status. Default is NULL (using transcriptome-wide FDR corrected pvalues). |
global |
Flag to plot a global Q-Q plot, default FALSE |
samplePoints |
Sample points for Q-Q plot, defaults to max 30k points |
legendPos |
Set legendpos, by default topleft. |
outlierRatio |
The fraction to be used for the outlier sample filtering |
conf.alpha |
If set, a confidence interval is plotted, defaults to 0.05 |
xlim , ylim |
The x/y limits for the plot or NULL to use the full data range |
ods , object |
An OutriderDataSet object. |
log |
If TRUE, the default, counts are plotted in log10. |
groups |
A character vector containing either group assignments of samples or sample IDs. Is empty by default. If group assignments are given, the vector must have the same length as the number of samples. If sample IDs are provided the assignment will result in a binary group assignemt. |
groupColSet |
A color set from RColorBrewer or a manual vector of colors, which length must match the number of categories from groups. |
normalized |
If TRUE, the normalized counts are used, the default, otherwise the raw counts |
rowCentered |
If TRUE, the counts are row-wise (gene-wise) centered |
rowGroups , colGroups |
A vector of co-factors (colnames of colData) for color coding the rows. It also accepts a data.frame of dim = (#samples, #groups). Must have more than 2 groups. |
rowColSet , colColSet |
A color set from RColorBrewer/colorRampPalette |
nRowCluster , nColCluster |
Number of clusters to show in the row and column dendrograms. If this argument is set the resulting cluster assignments are added to the OutriderDataSet. |
nBreaks |
number of breaks for the heatmap color scheme. Default to 50. |
show_names |
character string indicating whether to show 'none', 'row', 'col', or 'both' names on the heatmap axes. |
bcvQuantile |
quantile for choosing the cutoff for the biological coefficient of variation (BCV) |
nGenes |
upper limit of number of genes (defaults to 500). Subsets the top n genes based on the BCV. |
yadjust |
Option to adjust position of Median and 90 percentile labels. |
ylab |
The y axis label |
bins |
Number of bins used in the histogram. Defaults to 100. |
compareDisp |
If TRUE, the default, and if the autoCorrect normalization was used it computes the dispersion without autoCorrect and plots it for comparison. |
value |
Indicates which assay is shown in the manhattan plot. Defaults to 'pvalue'. Other options are 'zScore' and 'log2fc'. |
chr |
The chromosomes to be displayed in the |
featureRanges |
A GRanges object of the same length as the OutriderDataSet object that contains the genomic positions of features that are shown in the manhattan plot. |
chrColor |
A vector of length 2 giving the two colors used for coloring alternating chromosomes in the manhattan plot. Default colors are 'black' and 'darkgrey'. |
plotAberrantPerSample
: The number of aberrant events per sample are
plotted sorted by rank. The ... parameters are passed on to the
aberrant
function.
plotVolcano
: the volcano plot is sample-centric. It plots for a given
sample the negative log10 nominal P-values against the Z-scores for all
genes.
plotExpressionRank
: This function plots for a given gene the
expression level against the expression rank for all samples. This can
be used with normalized and unnormalized expression values.
plotQQ
: the quantile-quantile plot for a given gene or if
global
is set to TRUE
over the full data set. Here the
observed P-values are plotted against the expected ones in the negative
log10 space.
plotExpectedVsObservedCounts
: A scatter plot of the observed counts
against the predicted expression for a given gene.
plotCountCorHeatmap
: The correlation heatmap of the count data
of the full data set. Default the values are log transformed and
row centered. This function returns an OutriderDataSet with annotated
clusters if requested. The ... arguments are passed to the
pheatmap
function.
plotCountGeneSampleHeatmap
: A gene x sample heatmap of the raw or
normalized counts. By default they are log transformed and row centered.
Only the top 500 viable genes based on the BCV (biological coefficient
of variation) is used by default.
plotSizeFactors
: The sizefactor distribution within the dataset.
plotFPKM
: The distribution of FPKM values. If the OutriderDataSet
object contains the passedFilter
column, it will plot both FPKM
distributions for the expressed genes and for the filtered genes.
plotExpressedGenes
: A summary statistic plot on the number of genes
expressed within this dataset. It plots the sample rank (based on the
number of expressed genes) against the accumulated statistics up to the
given sample.
plotDispEsts
: Plots the dispersion of the OutriderDataSet
model against the normalized mean count. If autoCorrect is used it will also
estimate the dispersion without normalization for comparison.
plotPowerAnalysis
: The power analysis plot should give the user a
ruff estimate of the events one can be detected with OUTRIDER. Based on
the dispersion of the provided OUTRIDER data set the theoretical P-value
over the mean expression is plotted. This is done for different expression
levels. The curves are smooths to make the reading of the plot easier.
plotEncDimSearch
: Visualization of the hyperparameter optimization.
It plots the encoding dimension against the achieved loss (area under the
precision-recall curve). From this plot the optimum should be choosen for
the q
in fitting process.
plotManhattan
: Visualizes different metrics for each gene (pvalue,
log2 fold-change, z-score) along with the genomic coordinates of the
respective gene as a manhattan plot. Detected outlier genes are highlighted
in red.
If base R graphics are used nothing is returned else the plotly or the gplot object is returned.
ods <- makeExampleOutriderDataSet(dataset="Kremer")
implementation <- 'autoencoder'
mcols(ods)$basepairs <- 300 # assign pseudo gene length for filtering
ods <- filterExpression(ods)
# restrict FDR correction to set of genes of interest per sample
genesOfInterest <- list(MUC1372 = c("ATPIF1", "UROD", "YBX1",
sample(rownames(ods), 25)),
MUC1360 = sample(rownames(ods), 50),
MUC1350 = sample(rownames(ods), 75),
X76619 = sample(rownames(ods), 20),
X74172 = sample(rownames(ods), 150))
ods <- OUTRIDER(ods, implementation=implementation, subsets=list("exampleGenes"=genesOfInterest))
plotAberrantPerSample(ods)
plotAberrantPerSample(ods, subsetName="exampleGenes")
plotVolcano(ods, 49)
plotVolcano(ods, 'MUC1365', basePlot=TRUE)
plotVolcano(ods, 'MUC1351', basePlot=TRUE, xaxis="log2fc", label=c("NBPF16"))
plotVolcano(ods, 'MUC1372', basePlot=TRUE, subsetName="exampleGenes")
plotExpressionRank(ods, 35)
plotExpressionRank(ods, 35, subsetName="exampleGenes")
plotExpressionRank(ods, "NDUFS5", normalized=FALSE, label="MUC1372",
log=FALSE, main="Over expression outlier", basePlot=TRUE)
plotQQ(ods, 149)
plotQQ(ods, 149, subsetName="exampleGenes")
plotQQ(ods, global=TRUE, outlierRatio=0.001)
plotExpectedVsObservedCounts(ods, 149)
plotExpectedVsObservedCounts(ods, "ATAD3C", basePlot=TRUE)
plotExpectedVsObservedCounts(ods, "UROD", subsetName="exampleGenes")
plotExpressedGenes(ods)
sex <- sample(c("female", "male"), dim(ods)[2], replace=TRUE)
colData(ods)$Sex <- sex
ods <- plotCountCorHeatmap(ods, nColCluster=4, normalized=FALSE)
ods <- plotCountCorHeatmap(ods, colGroup="Sex", colColSet="Set1")
table(colData(ods)$clusterNumber_4)
plotCountGeneSampleHeatmap(ods, normalized=FALSE)
plotCountGeneSampleHeatmap(ods, rowGroups="theta",
rowColSet=list(c("white", "darkgreen")))
plotSizeFactors(ods)
mcols(ods)$basepairs <- 1
mcols(ods)$passedFilter <- rowMeans(counts(ods)) > 10
plotFPKM(ods)
plotDispEsts(ods, compareDisp=FALSE)
plotPowerAnalysis(ods)
## Not run:
# for speed reasons we only search for 5 different dimensions
ods <- findEncodingDim(ods, params=c(3, 10, 20, 35, 50),
implementation=implementation)
plotEncDimSearch(ods)
## End(Not run)
# To show the pvalues of a sample in a manhattan plot, rowRanges(ods) must
# contain the genomic position of each feature or a GRanges object must
# be provided
## Not run:
# in case rowRanges(ods) is a GRangesList, run this first once to speed up:
rowRanges(ods) <- unlist(endoapply(rowRanges(ods), range))
## End(Not run)
gr <- GRanges(
seqnames=sample(paste0("chr", 1:22), nrow(ods), replace=TRUE),
ranges=IRanges(start=runif(nrow(ods), min=0, max=1e5), width=100))
plotManhattan(ods, "MUC1350", value="pvalue", featureRanges=gr)
plotManhattan(ods, "MUC1350", value="l2fc", featureRanges=gr)
plotManhattan(ods, "MUC1372", featureRanges=gr, subsetName="exampleGenes")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.