View source: R/01_filter_count_matrix.R
This function loads and filters the input data for the
subsequent steps. The data loading and cleaning function is very basic, but
data input is critical to the package working correctly. If no input data
is given, the package defaults to using normal mucosal cells data for the
simulation and power calculations (see alpha_cells).
filter_counts(expr = alpha_cells, gene_thresh = 0, cell_thresh = 0)
expr | a data.frame in which column one holds the unique cell identifier, column two holds the sample identifier, and all remaining columns are genes. |
gene_thresh | the mean expression threshold for retaining genes. Defaults to 0. |
cell_thresh | the mean expression threshold for retaining cells. Defaults to 0. |
Input data should be formatted as follows:
Cell_ID | Individual_ID | Gene1 | Gene2 | Gene3 | ... |
Cell1_Ind1 | Ind1 | 12 | 24 | 0 | ... |
Cell2_Ind1 | Ind1 | 11 | 2 | 0 | ... |
Cell3_Ind1 | Ind1 | 10 | 0 | 0 | ... |
Cell4_Ind1 | Ind1 | 0 | 124 | 10 | ... |
Cell1_Ind2 | Ind2 | 9 | 37 | 18 | ... |
Cell2_Ind2 | Ind2 | 0 | 29 | 0 | ... |
Where the unique cell identifier is in column one and the sample identifier is in column two with the remaining columns all being genes.
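As a rough illustration of the layout above, the following base R sketch builds a small data.frame from the example rows and checks that it has the expected shape. The helper `check_layout()` is hypothetical and not part of the package; it only encodes the layout rules stated here (unique cell IDs in column one, sample IDs in column two, numeric gene columns after that).

```r
# Toy data mirroring the example rows in the table above
expr <- data.frame(
  Cell_ID       = c("Cell1_Ind1", "Cell2_Ind1", "Cell3_Ind1", "Cell4_Ind1",
                    "Cell1_Ind2", "Cell2_Ind2"),
  Individual_ID = c("Ind1", "Ind1", "Ind1", "Ind1", "Ind2", "Ind2"),
  Gene1 = c(12, 11, 10, 0, 9, 0),
  Gene2 = c(24, 2, 0, 124, 37, 29),
  Gene3 = c(0, 0, 0, 10, 18, 0),
  stringsAsFactors = FALSE
)

# check_layout() is a hypothetical helper, not a package function
check_layout <- function(expr) {
  stopifnot(is.data.frame(expr), ncol(expr) >= 3)
  # Column one must identify each cell uniquely
  ids_unique <- !anyDuplicated(expr[[1]])
  # Every column after the first two must hold numeric counts
  genes_numeric <- all(vapply(expr[, -(1:2), drop = FALSE],
                              is.numeric, logical(1)))
  ids_unique && genes_numeric
}

layout_ok <- check_layout(expr)
```

A check like this could be run before passing the data to filter_counts, so that layout problems surface early.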
A data.frame from which cells with mean count = 0 and genes with mean count = 0 have been filtered out.
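To make the idea of mean-based filtering concrete, here is a minimal base R sketch. It is not the package's implementation: `sketch_filter()` is a hypothetical function, and both the strict `>` comparison and the order of the two steps (genes first, then cells) are assumptions made for illustration.

```r
# sketch_filter() is hypothetical; it is NOT the package's implementation.
# Assumptions: strict ">" comparison, genes filtered before cells.
sketch_filter <- function(expr, gene_thresh = 0, cell_thresh = 0) {
  # Columns 1-2 are identifiers; the rest are gene counts
  counts <- as.matrix(expr[, -(1:2), drop = FALSE])
  # Keep genes whose mean count across cells exceeds the threshold
  keep_genes <- colMeans(counts) > gene_thresh
  counts <- counts[, keep_genes, drop = FALSE]
  # Keep cells whose mean count across retained genes exceeds the threshold
  keep_cells <- rowMeans(counts) > cell_thresh
  cbind(expr[keep_cells, 1:2, drop = FALSE],
        as.data.frame(counts[keep_cells, , drop = FALSE]))
}

toy <- data.frame(
  Cell_ID = c("Cell1_Ind1", "Cell2_Ind1", "Cell3_Ind1"),
  Individual_ID = "Ind1",
  Gene1 = c(12, 0, 10),
  Gene2 = c(0, 0, 0),   # all-zero gene
  Gene3 = c(3, 0, 7)
)
filtered <- sketch_filter(toy)
```

In this toy input, Gene2 and Cell2_Ind1 contain only zeros, so the sketch drops both while keeping the two identifier columns in place.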
Data should include only cells of the specific cell type you are interested in simulating or computing power for. Data should also contain as many unique sample identifiers as possible. If you are inputting data that has fewer than 5 unique values for the sample identifier (i.e., independent experimental units), the empirical estimation of the inter-individual heterogeneity will be very unstable. Finding a dataset with enough independent samples will be difficult at this time, but this should improve dramatically as experiments grow in sample size and the number of publicly available single-cell RNAseq datasets increases.
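The note about the number of independent samples can be turned into a quick pre-check. The sketch below counts the unique sample identifiers in column two and warns when there are fewer than 5. `count_samples()` and its `min_samples` argument are hypothetical names used for illustration only; they are not part of the package.

```r
# count_samples() is a hypothetical helper, not a package function
count_samples <- function(expr, min_samples = 5) {
  # Column two holds the sample (individual) identifier
  n_samples <- length(unique(expr[[2]]))
  if (n_samples < min_samples) {
    warning("Only ", n_samples, " unique sample identifier(s); ",
            "heterogeneity estimates may be unstable.")
  }
  n_samples
}

toy <- data.frame(
  Cell_ID = paste0("Cell", 1:6),
  Individual_ID = rep(c("Ind1", "Ind2"), each = 3),
  Gene1 = c(12, 11, 10, 0, 9, 0)
)

# suppressWarnings() keeps the example quiet; drop it to see the warning
n_found <- suppressWarnings(count_samples(toy))
```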
n_genes <- 10
n_cells <- 10
# Simulate negative binomial counts for one gene across all cells,
# using a randomly drawn mean and dispersion
make_data <- function(x){
  mu_random <- round(rgamma(n=1, shape=1, rate=0.001),0)
  size_random <- runif(n=1, min=0, max=3)
  rnbinom(n_cells, size=size_random, mu=mu_random)
}
# One column per gene, one row per cell
expr_dat <- as.data.frame(replicate(n_genes,make_data()))
expr_dat$CellID <- paste0("Cell",1:n_cells)
expr_dat$IND <- "IND1"
# Move the cell and sample identifiers to columns one and two
expr_dat <- expr_dat[,c(11,12,1:10)]
clean_expr_data <- filter_counts(expr_dat)