LocASN | R Documentation |
A function for normalizing single-cell RNA-seq gene expression data.
LocASN(
countmatrix,
conditions = NULL,
filter = FALSE,
gene_num_gezero = 3,
cell_num_gezero = 10,
numGeneforEst = 2000,
divideforFast = TRUE,
numDivide = NULL,
bw.method = "SJ",
cutoff = 2
)
countmatrix |
Input. Unnormalized count matrix (genes by cells). |
conditions |
Input (Optional). Condition/sample number of each cell. The default is NULL, meaning all cells are from the same condition/sample. |
filter |
Input (Optional). A logical value indicating whether to filter the data. If TRUE, see gene_num_gezero and cell_num_gezero for the filtering thresholds. The default value is FALSE. |
gene_num_gezero |
Input (Optional). A threshold (integer) that determines whether a gene is included. A gene is included only if it is expressed in at least gene_num_gezero cells. The default value is 3. |
cell_num_gezero |
Input (Optional). A threshold (integer) that determines whether a cell is included. A cell is included only if it contains at least cell_num_gezero expressed genes. The default value is 10. |
numGeneforEst |
Input (Optional). Use the top numGeneforEst (integer) genes, ranked by the proportion of cells in which the gene count is > 0, to estimate the scaling factors. Using a subset of genes speeds up computation. The default value is 2000. |
divideforFast |
Input (Optional). A logical value indicating whether to speed up computation by randomly dividing the cells in each condition into numDivide smaller groups. When divideforFast = TRUE, the number of groups is controlled by numDivide below. The default value is TRUE. |
numDivide |
Input (Optional). An integer, used when divideforFast = TRUE. The cells in each condition are randomly divided into numDivide smaller groups. With the default numDivide = NULL, the value is set automatically to the maximum of 1 and the smallest integer not less than the number of cells in each condition divided by 3K, so conditions with fewer than 3K cells are not divided. |
bw.method |
Input (Optional). The method used to estimate the bandwidths in kernel weighting. The default is "SJ" (SJ bandwidth, Sheather and Jones, 1991). The alternative is "RoT" (rule of thumb, Silverman, 1986). |
cutoff |
Input (Optional). For computational efficiency, weights are set to zero when the distance between cells is larger than cutoff times the bandwidth. The default value is 2. |
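As a rough illustration of the bw.method and cutoff arguments, the sketch below computes both bandwidth types with base R (stats::bw.SJ for Sheather-Jones, stats::bw.nrd0 for Silverman's rule of thumb) and then zeroes kernel weights beyond cutoff times the bandwidth. The Gaussian kernel, the one-dimensional covariate x, and the helper truncated_weights are assumptions made for illustration only; this page does not state which kernel or cell distance LocASN uses internally.

```r
set.seed(1)
# Hypothetical 1-D cell-level covariate (an assumption, not the package's distance)
x <- rnorm(200)

bw_sj  <- stats::bw.SJ(x)    # Sheather-Jones bandwidth
bw_rot <- stats::bw.nrd0(x)  # Silverman's rule-of-thumb bandwidth

# Hypothetical helper: Gaussian kernel weights of all cells relative to cell i,
# set to zero when the distance exceeds cutoff * bandwidth
truncated_weights <- function(x, i, bw, cutoff = 2) {
  d <- abs(x - x[i])
  w <- dnorm(d / bw)
  w[d > cutoff * bw] <- 0
  w
}

w <- truncated_weights(x, i = 1, bw = bw_sj, cutoff = 2)
sum(w > 0)  # number of cells that still contribute to cell 1
```

A smaller cutoff leaves fewer non-zero weights per cell, which is where the computational saving would come from.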
NormalizedData |
Matrix (genes by cells). Data matrix after normalization. |
scalingFactor |
Vector. Cell-specific scaling factors. |
delete_genes |
Vector. Indices of the genes deleted. |
delete_cells |
Vector. Indices of the cells deleted. |
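To make the filtering thresholds and the delete_genes / delete_cells outputs concrete, here is a minimal sketch of the kind of filtering the argument descriptions imply. It assumes that "expressed" means a count greater than zero and that genes are filtered before cells; neither detail is stated on this page, and filter_counts is a hypothetical helper, not part of the package.

```r
# Hypothetical helper: drop rarely expressed genes, then sparse cells
filter_counts <- function(counts, gene_num_gezero = 3, cell_num_gezero = 10) {
  # Genes expressed (count > 0) in fewer than gene_num_gezero cells
  delete_genes <- which(rowSums(counts > 0) < gene_num_gezero)
  keep_genes <- setdiff(seq_len(nrow(counts)), delete_genes)
  # Cells with fewer than cell_num_gezero expressed genes among the kept genes
  expressed_per_cell <- colSums(counts[keep_genes, , drop = FALSE] > 0)
  delete_cells <- which(expressed_per_cell < cell_num_gezero)
  keep_cells <- setdiff(seq_len(ncol(counts)), delete_cells)
  list(counts = counts[keep_genes, keep_cells, drop = FALSE],
       delete_genes = delete_genes,
       delete_cells = delete_cells)
}

set.seed(1)
toy <- matrix(rpois(50 * 40, lambda = 0.3), nrow = 50, ncol = 40)
out <- filter_counts(toy)
dim(out$counts); length(out$delete_genes); length(out$delete_cells)
```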
set.seed(12345)
G <- 2000; n <- 600 # G: number of genes, n: number of cells
mu <- rgamma(G, shape = 2, rate = 2)
NB_cell <- function(j) rnbinom(G, size = 0.1, mu = mu)
countsimdata <- sapply(1:n, NB_cell)
colnames(countsimdata) <- paste("c", 1:n, sep = "_")
rownames(countsimdata) <- paste("g", 1:G, sep = "_")
library(Matrix) # provides the "sparseMatrix" class used in the coercion below
Result <- LocASN(countmatrix = as(countsimdata, "sparseMatrix"))
Result$NormalizedData[1:10,1:10]; Result$scalingFactor[1:10]
#conditions <- c(rep(1,n/2), rep(2,n/2))
#Result2 <- LocASN(countmatrix = countsimdata, conditions = conditions)
#Result2$NormalizedData[1:10,1:10]; Result2$scalingFactor[1:10]
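The documented default for numDivide can be worked through for the example above. The sketch below is one reading of the argument description (the maximum of 1 and the ceiling of the number of cells per condition divided by 3000), not the package's internal code; default_num_divide and the random split via sample() are hypothetical.

```r
# Hypothetical helper mirroring the documented default for numDivide
default_num_divide <- function(n_cells) max(1, ceiling(n_cells / 3000))

default_num_divide(600)   # 1: fewer than 3K cells, so no division
default_num_divide(7500)  # 3

# One possible way to split the cells of a condition at random into k groups
set.seed(12345)
n_cells <- 7500
k <- default_num_divide(n_cells)
group <- sample(rep_len(seq_len(k), n_cells))
table(group)
```

With n = 600 cells, as in the simulated data, this reading gives a single group, so divideforFast = TRUE would have no effect there.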