score: Score matrix columns for groups of rows


View source: R/score.R

Description

Score a matrix by groups of rows. Scores are column averages over the subset of rows specified in a group. Optionally, control group scores can be generated and subtracted from the real scores. Control groups can either be user-provided or generated from binning of the rows. Similarly, the bins themselves can be user-provided or computed.
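As a minimal sketch of the core computation in base R (not the package's own implementation), the score of each column for a group is the mean of that column over the group's rows, and the corrected score subtracts the same mean computed over a control gene set. The toy matrix, the gene and cell names, and the helper `group_score` are hypothetical and serve only as illustration.

```r
# Toy expression matrix: 6 genes (rows) by 3 cells (columns).
mat <- matrix(
  c(1, 2, 3,
    4, 5, 6,
    7, 8, 9,
    1, 1, 1,
    2, 2, 2,
    3, 3, 3),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("gene", 1:6), paste0("cell", 1:3))
)

# Hypothetical helper: column averages over the rows named in `genes`.
group_score <- function(m, genes) {
  colMeans(m[genes, , drop = FALSE])
}

signature <- c("gene1", "gene2", "gene3")  # the "real" group
control   <- c("gene4", "gene5", "gene6")  # a control group

raw       <- group_score(mat, signature)        # one score per cell
corrected <- raw - group_score(mat, control)    # control-subtracted score
```

Here `raw` holds one value per cell, and `corrected` is the same vector after removing the control average for that cell.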

Usage

score(mat, groups, binmat = NULL, bins = NULL, controls = NULL,
  bin.control = F, center = F, nbin = 30, n = 100, replace = F)

Arguments

mat

an expression matrix of gene rows by cell columns.

groups

a character vector or list of character vectors. Each character vector is a group or signature to score each column against and should match subsets of rownames in <mat>.

binmat

an expression matrix of gene rows by cell columns that is used to create the gene bins and, ultimately, the control signatures for correcting the cell scores. For our use cases, <mat> and <binmat> are identical except that the former is row-centered and used to generate cell scores, while the latter is not row-centered and used to correct the cell scores. If NULL and bin.control = T (and neither <bins> nor <controls> is provided), <mat> will be used. Note that in this case <mat> should not be row-centered, otherwise the correction is not meaningful. Default: NULL

bins

a named character vector whose names are the rownames and whose values are the bin ID of each row. You can provide the bins directly (e.g. with bin()) rather than having them generated from <binmat>. Default: NULL
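A vector of this shape can be sketched in base R as follows. This assumes, purely for illustration, that rows are binned by the rank of their average expression; the package's bin() may use a different rule. The helper `make_bins` and the simulated matrix are hypothetical.

```r
# Hypothetical helper: assign each row to one of `nbin` bins by average expression.
make_bins <- function(m, nbin = 30) {
  avg <- rowMeans(m)
  # Rank rows by average expression, then cut the ranks into nbin intervals.
  ids <- cut(rank(avg, ties.method = "first"), breaks = nbin, labels = FALSE)
  # Names are the rownames, values are the bin IDs (as character).
  setNames(as.character(ids), rownames(m))
}

set.seed(1)
m <- matrix(rnorm(60 * 4), nrow = 60,
            dimnames = list(paste0("gene", 1:60), paste0("cell", 1:4)))

bins <- make_bins(m, nbin = 6)
head(bins)
```

Binning on ranks rather than raw averages keeps the bins of similar size, which matters when control genes are later sampled from each bin.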

bin.control

boolean value. If your controls can be generated straight from <mat> (i.e. if <mat> is not row-centered and you do not provide <binmat>, <bins>, or <controls>), then you can just call score(mat, groups, bin.control = TRUE). Default: F

center

boolean value. Should the resulting score matrix be column-centered? This option should be considered if binned controls are not used. Default: F
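As a small illustration, assuming that column-centering means subtracting each column's mean from the score matrix (cells as rows, groups as columns), the operation can be sketched in base R. The score values and names below are hypothetical.

```r
# Hypothetical score matrix: 3 cells (rows) by 2 groups (columns).
scores <- matrix(c(1, 2, 3, 10, 20, 30), nrow = 3,
                 dimnames = list(paste0("cell", 1:3), c("sigA", "sigB")))

# Subtract each column's mean so every group's scores average to zero.
centered <- sweep(scores, 2, colMeans(scores))
```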

nbin

numeric value specifying the number of bins. Not relevant if <bins> or <controls> are provided on input. Please be aware that we chose 30 bins for ~8500 genes; if your number of genes is very different, you should consider changing this. Default: 30

n

numeric value for the number of control genes to be sampled per gene in a signature. Not relevant if <controls> is provided on input. Default: 100

replace

boolean value. Allow bin sampling to be done with replacement. Default: F

controls

a character vector if <groups> is a character vector, or a list of character vectors of the same length as <groups>. Each character vector is a control signature whose genes should have expression levels similar to those in the corresponding real signature, but be otherwise biologically meaningless. You can provide the control signatures directly (e.g. with binmatch()) rather than having them generated from <binmat> / <bins>. Default: NULL
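One way to picture how a control signature could be drawn from bins is sketched below in base R. It assumes that, for each gene in the signature, n genes are sampled from that gene's own bin. Whether the package excludes the signature gene itself from the pool is not stated here, so this sketch leaves it in. The helper `sample_controls` and the toy bins are hypothetical and are not the package's binmatch().

```r
# Hypothetical helper: for each signature gene, sample n genes from its bin.
sample_controls <- function(genes, bins, n = 100, replace = FALSE) {
  unlist(lapply(genes, function(g) {
    pool <- names(bins)[bins == bins[[g]]]
    # Without replacement, a bin cannot yield more genes than it holds.
    size <- if (replace) n else min(n, length(pool))
    pool[sample.int(length(pool), size, replace = replace)]
  }), use.names = FALSE)
}

# Toy bins: 20 genes split evenly across bins "1" and "2".
bins <- setNames(rep(c("1", "2"), each = 10), paste0("gene", 1:20))

set.seed(42)
ctrl <- sample_controls(c("gene1", "gene12"), bins, n = 5)
```

Because each control gene comes from the same bin as a signature gene, the control signature should have a similar expression level distribution to the real one.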

Value

a matrix whose rows are the columns of the input matrix and whose columns are the scores for each group provided in <groups>. If one group is provided, the returned matrix will have 1 column.


jlaffy/scrabble documentation built on Nov. 16, 2019, 7:56 a.m.