filter_counts: Filter gene expression matrix
In kdzimm/hierarchicell: Simulating Cell-Type Specific and Hierarchical Single-Cell RNA-Seq Data


This function loads and filters the input data for the subsequent steps. The data loading and cleaning function is very basic, but data input is critical to the package working correctly. If no input data is given, the package defaults to using normal mucosal cells data for the simulation and power calculations (see alpha_cells).

filter_counts(expr = alpha_cells, gene_thresh = 0, cell_thresh = 0)

`expr`	a data.frame where the unique cell identifier is in column one and the sample identifier is in column two with the remaining columns all being genes.
`gene_thresh`	the mean expression threshold for retaining genes. Defaults to 0.
`cell_thresh`	the mean expression threshold for retaining cells. Defaults to 0.
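To make the two thresholds concrete, the sketch below applies mean-expression filtering in base R. It is not the package's implementation: the helper name `filter_by_mean` is hypothetical, and the strict `>` comparison is an assumption, chosen so that a threshold of 0 drops cells and genes whose mean count is 0.

```r
# Hypothetical helper, not part of hierarchicell: illustrates mean-threshold
# filtering on a data.frame laid out as cell ID, sample ID, then gene columns.
filter_by_mean <- function(expr, gene_thresh = 0, cell_thresh = 0) {
  counts <- as.matrix(expr[, -(1:2), drop = FALSE])
  # Assumption: strict ">" so a threshold of 0 removes all-zero cells and genes
  keep_cells <- rowMeans(counts) > cell_thresh
  keep_genes <- colMeans(counts) > gene_thresh
  # Always keep the two identifier columns
  expr[keep_cells, c(TRUE, TRUE, keep_genes), drop = FALSE]
}

toy <- data.frame(
  Cell_ID       = c("Cell1_Ind1", "Cell2_Ind1", "Cell1_Ind2"),
  Individual_ID = c("Ind1", "Ind1", "Ind2"),
  Gene1 = c(12, 0, 9),
  Gene2 = c(0, 0, 0),   # all-zero gene
  Gene3 = c(3, 0, 18),  # Cell2_Ind1 is an all-zero cell
  stringsAsFactors = FALSE
)

filtered <- filter_by_mean(toy)
```

With the default thresholds, `filtered` keeps the two cells that have nonzero counts and drops `Gene2`. Raising `gene_thresh` or `cell_thresh` in this sketch removes progressively more low-expression genes or cells.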

Input data should be formatted as follows:

Cell_ID	Individual_ID	Gene1	Gene2	Gene3	...
Cell1_Ind1	Ind1	12	24	0	...
Cell2_Ind1	Ind1	11	2	0	...
Cell3_Ind1	Ind1	10	0	0	...
Cell4_Ind1	Ind1	0	124	10	...
Cell1_Ind2	Ind2	9	37	18	...
Cell2_Ind2	Ind2	0	29	0	...

The unique cell identifier is in column one, the sample identifier is in column two, and all remaining columns are genes.
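As an illustration of this layout, the example rows can be written as a data.frame in base R. The layout checks at the end are hypothetical additions for this sketch; this page does not say whether `filter_counts` performs them itself.

```r
# The six example rows from the table, with the elided columns ("...") omitted
expr <- data.frame(
  Cell_ID       = c("Cell1_Ind1", "Cell2_Ind1", "Cell3_Ind1",
                    "Cell4_Ind1", "Cell1_Ind2", "Cell2_Ind2"),
  Individual_ID = c("Ind1", "Ind1", "Ind1", "Ind1", "Ind2", "Ind2"),
  Gene1 = c(12, 11, 10, 0, 9, 0),
  Gene2 = c(24, 2, 0, 124, 37, 29),
  Gene3 = c(0, 0, 0, 10, 18, 0),
  stringsAsFactors = FALSE
)

# Hypothetical sanity checks on the expected layout
ids_unique    <- !anyDuplicated(expr[[1]])  # column one: unique cell IDs
genes_numeric <- all(vapply(expr[, -(1:2), drop = FALSE], is.numeric, logical(1)))
stopifnot(ids_unique, genes_numeric)
```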

A data.frame from which cells with a mean count of 0 and genes with a mean count of 0 have been removed.

Data should include only cells of the specific cell type you are interested in simulating or computing power for. Data should also contain as many unique sample identifiers as possible. If you are inputting data that has fewer than 5 unique values for the sample identifier (i.e., independent experimental units), then the empirical estimation of the inter-individual heterogeneity is going to be very unstable. Finding such a dataset will be difficult at this time, but over time (as experiments grow in sample size and the number of publicly available single-cell RNA-seq datasets increases), this should improve dramatically.
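A quick way to see how many independent experimental units a dataset contains is to count the unique values in column two before filtering. The following is a minimal sketch on toy data; the object names are illustrative, and the warning message is not something the package emits.

```r
# Toy input: cell ID in column one, sample ID in column two, then genes
expr <- data.frame(
  Cell_ID       = paste0("Cell", 1:6),
  Individual_ID = rep(c("Ind1", "Ind2", "Ind3"), each = 2),
  Gene1         = c(12, 11, 10, 0, 9, 0),
  stringsAsFactors = FALSE
)

n_samples        <- length(unique(expr[[2]]))  # independent experimental units
cells_per_sample <- table(expr[[2]])           # cells contributed by each sample

# The cutoff of 5 is the one given in the note above
if (n_samples < 5) {
  message("Only ", n_samples, " unique sample identifiers; ",
          "estimates of inter-individual heterogeneity may be unstable.")
}
```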

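library(hierarchicell)

n_genes <- 10
n_cells <- 10

# Simulate negative binomial counts for one gene across n_cells cells
make_data <- function(){
 mu_random <- round(rgamma(n=1, shape=1, rate=0.001),0)
 size_random <- runif(n=1, min=0, max=3)
 rnbinom(n_cells, size=size_random, mu=mu_random)
}

# One column per gene, one row per cell
expr_dat <- as.data.frame(replicate(n_genes,make_data()))
expr_dat$CellID <- paste0("Cell",1:n_cells)
expr_dat$IND <- "IND1"
# Reorder so the cell ID is column one and the sample ID is column two
expr_dat <- expr_dat[,c(n_genes+1,n_genes+2,1:n_genes)]
clean_expr_data <- filter_counts(expr_dat)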

kdzimm/hierarchicell documentation built on Dec. 21, 2021, 5:23 a.m.
