kBET | R Documentation |
kBET runs a chi-square test to evaluate the probability of a batch effect.
kBET(df, batch, k0 = NULL, knn = NULL, testSize = NULL,
do.pca = TRUE, dim.pca = 50, heuristic = TRUE, n_repeat = 100,
alpha = 0.05, addTest = FALSE, verbose = FALSE, plot = TRUE,
adapt = TRUE)
df | dataset (rows: cells, columns: features) |
batch | batch ID for each cell, or a data frame with both condition and replicates |
k0 | number of nearest neighbours to test on (neighbourhood size) |
knn | an n x k matrix of nearest neighbours for each cell (optional) |
testSize | number of data points to test (defaults to 10 percent of the sample size, but at least 25) |
do.pca | perform a PCA prior to the kNN search? (defaults to TRUE) |
dim.pca | if do.pca=TRUE, the number of dimensions to consider (defaults to 50) |
heuristic | compute an optimal neighbourhood size k (defaults to TRUE) |
n_repeat | number of subsets to evaluate in order to create statistics on the batch estimates (defaults to 100) |
alpha | significance level (defaults to 0.05) |
addTest | perform an LRT approximation to the multinomial test AND a multinomial exact test, if appropriate (defaults to FALSE) |
verbose | display the stages of the current computation (defaults to FALSE) |
plot | if stats > 10, a boxplot of the resulting rejection rates is created (defaults to TRUE) |
adapt | in some cases, a number of cells do not contribute to any neighbourhood, which may cause an imbalance between observed and expected batch label frequencies. Frequencies will be adapted if adapt=TRUE (default). |
list object
summary - a rejection rate for the data, an expected rejection rate for random labeling and the significance of the observed result
results - detailed list for each tested cell; p-values for expected and observed label distribution
average.pval - significance level over the averaged batch label distribution in all neighbourhoods
stats - extended test summary for every sample
params - list of input parameters and adapted parameters, respectively
outsider - only shown if adapt=TRUE. List of samples without mutual nearest neighbour:
  index - index of each outsider sample
  categories - tabularised labels of outsiders
  p.val - significance level of outsider batch label distribution vs expected frequencies. If the significance level is lower than alpha, expected frequencies will be adapted
If the optimal neighbourhood size (k0) is smaller than 10, NA is returned.
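Because the return value is either a list or NA, calling code may want to check the type before indexing into it (in R, `NA$summary` raises an error). The following is a minimal sketch; `kbet_summary_or_null` is a hypothetical helper that is not part of kBET, and the mock objects merely stand in for real kBET output.

```r
# Hypothetical helper (not part of kBET): return the `summary` element of a
# kBET result, or NULL when kBET returned NA instead of a list.
kbet_summary_or_null <- function(res) {
  if (!is.list(res)) {
    return(NULL)
  }
  res$summary
}

# Mock objects for illustration only; the real `summary` has its own layout.
mock_ok <- list(summary = data.frame(placeholder = 1), params = list())
mock_na <- NA

is.null(kbet_summary_or_null(mock_na))  # no list came back, so NULL
kbet_summary_or_null(mock_ok)           # the stored summary element
```

Returning NULL rather than stopping lets a loop over many datasets skip the ones for which no estimate was produced.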
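library(kBET)
# 10 batches with 20 cells each
batch <- rep(seq_len(10), each = 20)
# 200 cells x 250 features of zero-inflated Poisson counts
data <- matrix(rpois(n = 50000, lambda = 10) * rbinom(50000, 1, prob = 0.5), nrow = 200)
batch.estimate <- kBET(data, batch)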
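To make the idea of a chi-square test on neighbourhood batch labels more concrete, here is a base-R sketch that does not call kBET and should not be read as its implementation. It assumes a fixed neighbourhood size `k = 20`, brute-force Euclidean distances without PCA, and the global batch proportions as expected frequencies; the names `neighbourhood_pval`, `tested`, and `rejection_rate` are illustrative only.

```r
set.seed(1)

# Toy data in the shape described above: rows are cells, columns are features.
batch <- rep(seq_len(10), each = 20)
data <- matrix(rpois(n = 50000, lambda = 10) * rbinom(50000, 1, prob = 0.5), nrow = 200)

k <- 20  # illustrative neighbourhood size (an assumption, not kBET's heuristic)

# Expected batch frequencies without a batch effect: the global proportions.
expected <- as.numeric(table(batch)) / length(batch)

# Brute-force Euclidean distances between all cells (fine for 200 cells).
d <- as.matrix(dist(data))

# Chi-square test of one cell's neighbourhood against the expected frequencies.
neighbourhood_pval <- function(i) {
  nn <- head(setdiff(order(d[i, ]), i), k)  # k nearest cells, excluding cell i
  observed <- as.numeric(table(factor(batch[nn], levels = sort(unique(batch)))))
  stat <- sum((observed - k * expected)^2 / (k * expected))
  pchisq(stat, df = length(expected) - 1, lower.tail = FALSE)
}

# Test a random subset of 25 cells and report the fraction rejected at 0.05.
tested <- sample(nrow(data), 25)
pvals <- vapply(tested, neighbourhood_pval, numeric(1))
rejection_rate <- mean(pvals < 0.05)
rejection_rate
```

In this toy data the batch labels are unrelated to the counts, so a low rejection rate would be the expected outcome; data with a real batch effect should push it up.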