findClusters | R Documentation |
Given a table of gene positions that has a score column, genes will first be sorted into positional order and consecutive windows of high or low scores will be reported.
findClusters(stats, score.col = NULL, w.size = NULL, n.med = NULL, n.consec = NULL,
cut.samps = NULL, maxFDR = 0.05, trend = c("down", "up"), n.perm = 100,
getFDRs = FALSE, verbose = TRUE)
stats |
A table (for example, a data.frame) of gene positions that has a score column. |
score.col |
A number that gives the column in stats that contains the scores. |
w.size |
The number of consecutive genes to consider windows over. Must be odd. |
n.med |
Minimum number of genes in a window whose window-centred median scores must be above the cutoff. |
n.consec |
Minimum cluster size. |
cut.samps |
A vector of score cutoffs to calculate the FDR at. |
maxFDR |
The highest FDR level still deemed to be significant. |
trend |
Whether the clusters must have all positive scores (enrichment), or all negative scores (depletion). |
n.perm |
How many random tables to generate to use in the FDR calculations. |
getFDRs |
If TRUE, will also return the table of FDRs at a variety of score cutoffs, from which the score cutoff for calling clusters was chosen. |
verbose |
Whether to print progress of computations. |
First, the median over a window of size w.size is calculated in a rolling window and associated with the middle gene of the window. Windows are then run over the genes again, and the gene at the centre of a window is significant if there are also at least n.med genes with representative medians above the score cutoff in the window that surrounds it. These marker genes are extended outwards for as long as the score has the same sign. The order of the stats rows is randomised, and this process is done for every randomisation.
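The windowing steps described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's implementation: the helpers rollingMedian, countAbove and extendMarkers are hypothetical names, end genes without a full window are given NA medians, and a gene is treated as a marker whenever the count of above-cutoff medians in its surrounding window reaches n.med.

```r
# Hypothetical helper: rolling median over an odd-sized window, assigned to the
# middle gene. Genes too close to either end to have a full window get NA.
rollingMedian <- function(scores, w.size) {
  stopifnot(w.size %% 2 == 1, w.size <= length(scores))
  half <- (w.size - 1) / 2
  meds <- rep(NA_real_, length(scores))
  for (i in (half + 1):(length(scores) - half)) {
    meds[i] <- median(scores[(i - half):(i + half)])
  }
  meds
}

# Hypothetical helper: for each gene, count how many genes in the window
# centred on it are flagged TRUE. Windows are truncated at the ends (assumption).
countAbove <- function(above, w.size) {
  half <- (w.size - 1) / 2
  n <- length(above)
  sapply(seq_len(n), function(i) sum(above[max(1, i - half):min(n, i + half)]))
}

# Hypothetical helper: grow each marker gene outwards while the neighbouring
# scores keep the same sign as the marker's score.
extendMarkers <- function(scores, marker) {
  inCluster <- marker
  n <- length(scores)
  for (i in which(marker)) {
    j <- i - 1
    while (j >= 1 && sign(scores[j]) == sign(scores[i])) {
      inCluster[j] <- TRUE
      j <- j - 1
    }
    j <- i + 1
    while (j <= n && sign(scores[j]) == sign(scores[i])) {
      inCluster[j] <- TRUE
      j <- j + 1
    }
  }
  inCluster
}

# Toy scores, already in positional order (illustrative values only).
scores <- c(0.1, -0.3, 4.2, 3.8, 4.5, 3.9, 0.2, -1.0, 0.4)
w.size <- 3
n.med <- 2
cutoff <- 3

meds <- rollingMedian(scores, w.size)
above <- !is.na(meds) & meds > cutoff
marker <- countAbove(above, w.size) >= n.med
cluster <- extendMarkers(scores, marker)
```

In this toy input, cluster marks a run of genes around the planted high scores. The same steps would be repeated on each row-shuffled copy of the table to obtain the counts used for the FDR.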
The procedure for calling clusters is done at a range of score cutoffs. The first score cutoff to give an FDR below maxFDR is chosen as the cutoff to use, and clusters are then called based on this cutoff.
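The cutoff selection rule can be illustrated with a short R sketch. The table below is hypothetical: its column names (cutoff, FDR) and its values are made up purely for illustration and are not output from findClusters.

```r
# Hypothetical table of score cutoffs and their estimated FDRs.
# The numbers are invented for illustration only.
fdrTable <- data.frame(cutoff = c(1, 2, 3, 4),
                       FDR = c(0.40, 0.12, 0.03, 0.01))
maxFDR <- 0.05

# Take the first cutoff, in the order tried, whose FDR is below maxFDR.
passing <- which(fdrTable$FDR < maxFDR)
chosen <- if (length(passing) > 0) fdrTable$cutoff[passing[1]] else NA
```

Here the first two cutoffs fail the threshold, so the third cutoff is selected. If no cutoff passes, this sketch returns NA; how the function itself behaves in that case is not stated here.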
If getFDRs is FALSE, then only the stats table is returned, with an additional column, cluster. If getFDRs is TRUE, then a list is returned, with elements:
table |
The table |
FDR |
The table of score cutoffs tried, and their FDRs. |
Dario Strbenac, Aaron Statham
Saul Bert, in preparation
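# Simulate 500 genes at random positions on five chromosomes.
chrs <- sample(paste("chr", 1:5, sep = ""), 500, replace = TRUE)
starts <- sample(1:10000000, 500, replace = TRUE)
ends <- starts + 10000
genes <- data.frame(chr = chrs, start = starts, end = ends, strand = '+')
# Sort the genes into positional order.
genes <- genes[order(genes$chr, genes$start), ]
# Background scores, with a run of ten high-scoring genes planted in rows 21 to 30.
genes$t.stat <- rnorm(500, 0, 2)
genes$t.stat[21:30] <- rnorm(10, 4, 1)
# Scores are in column 5; w.size = 5, n.med = 2, n.consec = 3, cutoffs 1 to 10.
findClusters(genes, 5, 5, 2, 3, seq(1, 10, 1), trend = "up", n.perm = 2)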