convert_se_gene_ids: convert_se_gene_ids


View source: R/loading_helper_functions.r

Description

Change the gene IDs in the supplied dataset_se object to another ID already present in the gene info (as seen with rowData()).

Usage

convert_se_gene_ids(dataset_se, new_id, eval_col, find_max = TRUE)

Arguments

dataset_se

SummarizedExperiment object containing count data. 'ID' and 'group' must also be set within the cell information (see colData()).

new_id

A column within the feature information (view rowData(dataset_se)) of the dataset_se, which will become the new ID column. This column does not need to be unique: duplicates are resolved using eval_col. Any NAs will be dropped.

eval_col

Which column to use to break ties between duplicate new_id values. Must be a column within the feature information (view rowData(dataset_se)) of the dataset_se. Total reads per gene feature is a good choice.

find_max

If FALSE, the feature with the minimum eval_col is chosen instead of the maximum. Default = TRUE

Value

A modified dataset_se. ID will now be new_id, and unique. It will have fewer genes if the old ID to new ID mapping was not 1:1. For each duplicated new_id, the gene kept is the one with the maximum (or minimum) eval_col. If eval_col is also tied, the alphabetically first should be picked, but this behaviour could change.
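The selection rule described above can be illustrated with a minimal base R sketch. This is not the celaref implementation: gene_info is a hypothetical plain data.frame standing in for rowData(dataset_se), and pick_rows is a made-up helper name. It drops NA new IDs, then keeps one row per new ID according to eval_col. Ties in eval_col fall back to original row order here, which may differ from what convert_se_gene_ids does.

```r
# Hypothetical stand-in for rowData(dataset_se); values are invented.
gene_info <- data.frame(
  ID          = c("g1", "g2", "g3", "g4", "g5"),
  symbol      = c("A",  "A",  "B",  NA,   "C"),
  total_count = c(10,   25,   7,    99,   3),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of celaref): keep one row per new_id,
# choosing the row with the largest (or smallest) eval_col value.
pick_rows <- function(info, new_id, eval_col, find_max = TRUE) {
  # Drop features whose new ID is NA
  info <- info[!is.na(info[[new_id]]), , drop = FALSE]
  # Sort so the preferred row for each new ID comes first
  ord  <- order(info[[eval_col]], decreasing = find_max)
  info <- info[ord, , drop = FALSE]
  # Keep only the first occurrence of each new ID
  info[!duplicated(info[[new_id]]), , drop = FALSE]
}

kept_max <- pick_rows(gene_info, "symbol", "total_count")
kept_min <- pick_rows(gene_info, "symbol", "total_count", find_max = FALSE)
```

In this toy table, symbol "A" maps to both g1 and g2, so only one of them survives in each result, and g4 is dropped because its symbol is NA.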

See Also

SummarizedExperiment For general documentation on SummarizedExperiment objects.

load_se_from_files For reading data from flat files (not 10X cellRanger output)

Examples

library(celaref)
library(SummarizedExperiment)  # provides rowData() and assay()

# The demo dataset doesn't have other names, so make some up
# (don't do this with real data)
dataset_se <- demo_ref_se
rowData(dataset_se)$dummyname <- toupper(rowData(dataset_se)$ID)

# If not already present, define a column to evaluate, 
# typically total reads/gene.
rowData(dataset_se)$total_count <- rowSums(assay(dataset_se))

dataset_se <- convert_se_gene_ids(dataset_se, new_id='dummyname', eval_col='total_count') 

celaref documentation built on Nov. 8, 2020, 5:03 p.m.