| kBET | R Documentation |
kBET runs a chi-square test on the batch label distribution in local neighbourhoods to evaluate
the probability of a batch effect.
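To make the idea concrete, the following base R sketch compares the batch labels found in one neighbourhood with the batch frequencies of the whole data set using `chisq.test`. It illustrates the general principle only and is not the package's implementation; the helper name `neighbourhood_pvalue` and the two example neighbourhoods are hypothetical.

```r
# Hypothetical helper (not part of kBET): chi-square test of the batch labels
# in one neighbourhood against the batch frequencies of the full data set.
neighbourhood_pvalue <- function(neigh_labels, all_labels) {
  lev <- sort(unique(all_labels))
  # expected batch proportions, taken from all cells
  expected <- as.numeric(table(factor(all_labels, levels = lev))) / length(all_labels)
  # observed batch counts in the neighbourhood
  observed <- as.numeric(table(factor(neigh_labels, levels = lev)))
  # warnings about small expected counts are silenced for this illustration
  suppressWarnings(chisq.test(observed, p = expected))$p.value
}

set.seed(1)
batch <- rep(seq_len(10), each = 20)        # 10 batches, 20 cells each

# A neighbourhood of 50 cells drawn at random from all cells
p_mixed <- neighbourhood_pvalue(batch[sample(length(batch), 50)], batch)

# A neighbourhood of 50 cells that contains only batches 1 and 2
p_skewed <- neighbourhood_pvalue(rep(c(1, 2), each = 25), batch)

p_mixed
p_skewed
```

A small p-value for a neighbourhood means that its batch composition deviates from the overall composition, which is the kind of local evidence for a batch effect that the test looks for.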
kBET(df, batch, k0 = NULL, knn = NULL, testSize = NULL,
do.pca = TRUE, dim.pca = 50, heuristic = TRUE, n_repeat = 100,
alpha = 0.05, addTest = FALSE, verbose = FALSE, plot = TRUE,
adapt = TRUE)
df |
dataset (rows: cells, columns: features) |
batch |
batch id for each cell or a data frame with both condition and replicates |
k0 |
number of nearest neighbours to test on (neighbourhood size) |
knn |
an n x k matrix of nearest neighbours for each cell (optional) |
testSize |
number of data points to test (default: 10 percent of the sample size, but at least 25) |
do.pca |
perform a PCA prior to the knn search? (defaults to TRUE) |
dim.pca |
if do.pca=TRUE, choose the number of dimensions to consider (defaults to 50) |
heuristic |
compute an optimal neighbourhood size k (defaults to TRUE) |
n_repeat |
number of subsets to evaluate in order to create statistics on the batch estimates (defaults to 100) |
alpha |
significance level (defaults to 0.05) |
addTest |
perform an LRT-approximation to the multinomial test AND a multinomial exact test (if appropriate) |
verbose |
displays stages of current computation (defaults to FALSE) |
plot |
if stats > 10, then a boxplot of the resulting rejection rates is created |
adapt |
In some cases, a number of cells do not contribute to any neighbourhood and this may cause an imbalance in observed and expected batch label frequencies. Frequencies will be adapted if adapt=TRUE (default). |
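As an illustration of the n x k layout described for `knn`, the sketch below builds such a matrix in base R from pairwise Euclidean distances. The helper `knn_matrix` is hypothetical, and the sketch assumes that each row lists the indices of a cell's nearest neighbours with the cell itself excluded; check the package source before relying on that convention.

```r
# Hypothetical helper: brute-force nearest-neighbour index matrix (n x k).
# Assumes k >= 2 and that a cell is not counted as its own neighbour.
knn_matrix <- function(x, k) {
  d <- as.matrix(dist(x))   # pairwise Euclidean distances between rows (cells)
  diag(d) <- Inf            # exclude each cell from its own neighbour list
  # order() ranks the distances in each row; keep the k closest indices
  t(apply(d, 1, function(row) order(row)[seq_len(k)]))
}

set.seed(1)
x <- matrix(rnorm(40 * 5), nrow = 40)   # 40 cells, 5 features
knn <- knn_matrix(x, k = 10)
dim(knn)
```

Computing all pairwise distances is only practical for small data sets; a dedicated nearest-neighbour search is the better choice for larger ones.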
list object
summary - a rejection rate for the data, an expected rejection rate for random labeling and the significance for the observed result
results - detailed list for each tested cell; p-values for expected and observed label distribution
average.pval - significance level over the averaged batch label distribution in all neighbourhoods
stats - extended test summary for every sample
params - list of input parameters and adapted parameters, respectively
outsider - only shown if adapt=TRUE. List of samples without mutual nearest neighbour:
index - index of each outsider sample
categories - tabularised labels of outsiders
p.val - significance level of outsider batch label distribution vs expected frequencies. If the significance level is lower than alpha, expected frequencies will be adapted
If the optimal neighbourhood size (k0) is smaller than 10, NA is returned.
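Because the function may return either a list or NA, downstream code may want to check the result before accessing its elements. The sketch below shows one way to do that; `kbet_summary` is a hypothetical helper, and the mock objects stand in for real output (the column name `rate` is a placeholder, not a documented field).

```r
# Hypothetical helper: return the 'summary' element, or NULL when the
# result is not a list (for example when NA was returned).
kbet_summary <- function(res) {
  if (is.list(res) && !is.null(res$summary)) res$summary else NULL
}

# Mock objects standing in for real kBET output
mock_ok <- list(summary = data.frame(rate = 0.3), params = list(k0 = 25))
mock_na <- NA

kbet_summary(mock_ok)   # the mock summary data frame
kbet_summary(mock_na)   # NULL
```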
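library(kBET)
# 10 batches with 20 cells each
batch <- rep(seq_len(10), each = 20)
# 200 cells x 250 features of zero-inflated Poisson counts
data <- matrix(rpois(n = 50000, lambda = 10) * rbinom(50000, 1, prob = 0.5), nrow = 200)
batch.estimate <- kBET(data, batch)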