get_normalized_expression_matrix: Process cellranger output
In robAndrewCarter/rnaseqUtils: Utility Functions for RNASeq Data Analysis

Read cellranger output directories, filter the data, and return a normalized expression matrix

get_normalized_expression_matrix(run_to_path_df = data.frame(), .genome = c("mm10", "hg19"), umi_limits = c(3000, Inf))

`run_to_path_df:`	A data frame with two columns: (1) run_name (character), the name of the cellranger run, such as a patient ID or sample date; (2) cellranger_path (character), the path to the top-level cellranger output directory.
`.genome:`	A character string naming the genome, typically 'mm10' or 'hg19'.
`umi_limits:`	A numeric vector of length 2 giving the lower and upper bounds for the UMI count per cell.
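
As an illustration of the expected input, the following sketch builds a `run_to_path_df` with the two documented columns. The run names and paths are hypothetical placeholders, and the call to the function is left commented out because it needs the rnaseqUtils package and real cellranger output on disk.

```r
# Hypothetical run names and paths; replace with real cellranger output directories.
run_to_path_df <- data.frame(
  run_name = c("patient_01", "patient_02"),
  cellranger_path = c("/path/to/patient_01", "/path/to/patient_02"),
  stringsAsFactors = FALSE  # keep both columns as character, as documented
)

# Not run here: requires rnaseqUtils and existing cellranger output.
# expr_mat <- get_normalized_expression_matrix(
#   run_to_path_df,
#   .genome = "mm10",
#   umi_limits = c(3000, Inf)
# )
```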

This function reads a series of cellranger output directories and merges the results into one large matrix. It then filters out cells whose UMI counts fall outside the range specified in umi_limits. Mitochondrial and ribosomal protein-coding genes are then removed from the matrix, as are ENSEMBL IDs with no expression across the remaining cells. Finally, the data are scaled using a global scaling factor (total UMIs per cell) and log2-transformed.
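
The filtering and normalization steps described above can be sketched in base R on a toy matrix. This is not the package's implementation. The toy counts, the gene-name pattern used to flag mitochondrial and ribosomal genes, the `scale_factor` of 10000, the pseudocount of 1, and the choice to compute per-cell totals after gene removal are all assumptions made for illustration.

```r
# Toy genes-by-cells UMI count matrix (hypothetical data, not cellranger output).
counts <- matrix(
  c(10, 0,  5, 1,
    20, 0,  5, 1,
     0, 0,  0, 0,
    70, 1, 10, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "mt-Co1"),
                  c("cell1", "cell2", "cell3", "cell4"))
)

umi_limits <- c(10, Inf)  # small bounds chosen to suit the toy data

# 1. Keep cells whose total UMI count lies within umi_limits.
umi_per_cell <- colSums(counts)
keep_cells <- umi_per_cell >= umi_limits[1] & umi_per_cell <= umi_limits[2]
filtered <- counts[, keep_cells, drop = FALSE]

# 2. Drop mitochondrial and ribosomal protein genes.
#    Assumption: they are identified here by a gene-name prefix pattern.
is_mito_or_ribo <- grepl("^mt-|^Rp[sl]", rownames(filtered))
filtered <- filtered[!is_mito_or_ribo, , drop = FALSE]

# 3. Drop genes with no expression across the remaining cells.
filtered <- filtered[rowSums(filtered) > 0, , drop = FALSE]

# 4. Scale each cell by its total UMIs, then log2-transform.
#    Assumptions: scale_factor and the pseudocount of 1 are illustrative.
scale_factor <- 10000
scaled <- sweep(filtered, 2, colSums(filtered), "/") * scale_factor
normalized <- log2(scaled + 1)
```

With these toy values, only cell1 and cell3 pass the UMI filter, and only geneA and geneB remain after gene removal.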

robAndrewCarter/rnaseqUtils documentation built on May 22, 2019, 12:55 p.m.