load_dataset: Load dataset from file

View source: R/preprocess.R

Description

Loads an expression dataset from a tab-delimited text file, a binary R data file, or a folder of 10x cellranger output.

Usage

load_dataset(
  filename,
  type = "tsv",
  cell_2_node_map = DEFAULT_CELL_2_NODE_MAP,
  drop_cols = 1L,
  rownames_col = 1L,
  excluded_samples = NULL,
  as_Matrix = TRUE,
  umi_to_upm = TRUE,
  verbose = FALSE
)

Arguments

filename

the name of the file containing the dataset (tab-delimited text for type tsv, binary R data for type rds). For 10x data, a path to the folder containing the three cellranger output files (barcodes, features and matrix).

type

one of "tsv" (tab-delimited text file), "rds" (binary R data file) or "10x" (UMI count data). Default is "tsv".

cell_2_node_map

a **function** that maps a vector of cell IDs to a vector of node IDs to which the cells belong. The default function assumes that the cell ID is a string separated by "-" and that the node ID is contained in the substring until the first "-" character.
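The mapping described above can be sketched in base R. This is only an illustration of the documented behaviour, not the package's DEFAULT_CELL_2_NODE_MAP itself; the function name `prefix_cell_2_node_map` and the cell IDs are hypothetical.

```r
# Hypothetical stand-in for a cell_2_node_map function: the node ID is
# taken to be everything before the first "-" in each cell ID.
prefix_cell_2_node_map <- function(cell_ids) {
  # Delete the first "-" and everything after it; IDs without a "-"
  # are returned unchanged.
  sub("-.*$", "", cell_ids)
}

# Made-up cell IDs, for illustration only
cell_ids <- c("S1-P01-A01", "S1-P01-A02", "S2-P03-B07")
node_ids <- prefix_cell_2_node_map(cell_ids)
```

Any function with the same shape (character vector in, character vector of equal length out) could presumably be passed as cell_2_node_map.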

drop_cols

number of leading columns to drop from the dataset, e.g. if drop_cols == 3 then columns 1:3 are dropped. Default is 1.

rownames_col

the column that contains the rownames, i.e. gene symbols or identifiers. Default is 1.
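To show how drop_cols and rownames_col relate, here is a minimal base R sketch of the kind of processing these two arguments describe. It is an assumption about the behaviour based on the descriptions above, not the package's implementation, and the tiny inline table is made up.

```r
# A made-up tab-delimited table: first column holds gene symbols,
# remaining columns hold per-cell values.
tsv_text <- "gene\tS1-c1\tS1-c2\tS2-c1\nA\t1\t0\t3\nB\t0\t2\t5\n"

# check.names = FALSE keeps the "-" in the cell IDs intact
raw <- read.delim(text = tsv_text, check.names = FALSE,
                  stringsAsFactors = FALSE)

rownames_col <- 1L  # column holding the row names
drop_cols <- 1L     # number of leading columns to drop (assumed >= 1 here)

genes <- raw[[rownames_col]]
# Drop columns 1:drop_cols and convert what is left to a matrix
expr <- as.matrix(raw[, -seq_len(drop_cols), drop = FALSE])
rownames(expr) <- genes
```

With the defaults, the same column serves both purposes: it supplies the row names and is then dropped from the data.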

excluded_samples

a vector containing the names of the samples that should be excluded from the returned dataset. Defaults to NULL.
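A rough sketch of sample exclusion follows. It assumes that sample names are matched against the node IDs derived from the cell IDs (the prefix before the first "-"); the documentation above does not confirm this, and the matrix and names are made up.

```r
# Made-up expression matrix with cell IDs of the form "<sample>-<cell>"
expr <- matrix(1:6, nrow = 2,
               dimnames = list(c("A", "B"),
                               c("S1-c1", "S1-c2", "S2-c1")))

excluded_samples <- c("S2")

# Assumption: the sample of each cell is the prefix before the first "-"
sample_ids <- sub("-.*$", "", colnames(expr))

# Keep only the cells whose sample is not in the exclusion list
kept <- expr[, !(sample_ids %in% excluded_samples), drop = FALSE]
```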

as_Matrix

logical indicating whether the loaded dataset should be returned as an S4 Matrix class (supports sparse representation) or base R matrix type. Default is TRUE.
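The difference between the two return types can be illustrated with the Matrix package. This sketch only shows what a sparse S4 Matrix looks like next to a base R matrix; how load_dataset performs the conversion internally is not stated here, and the data is made up.

```r
library(Matrix)

# Made-up dense base R matrix with mostly zero entries
dense <- matrix(c(0, 0, 5, 0, 2, 0), nrow = 2,
                dimnames = list(c("A", "B"), c("c1", "c2", "c3")))

# Sparse S4 representation: only the non-zero entries are stored
sparse <- Matrix(dense, sparse = TRUE)

# Converting back recovers the base R matrix
roundtrip <- as.matrix(sparse)
```

For single-cell count data, where most entries are zero, the sparse form usually needs far less memory.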

umi_to_upm

logical indicating whether UMI counts should be converted to UMIs per million (UPM, analogous to CPM). Valid only for 10x data. Default is TRUE.
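A per-million conversion is commonly computed by dividing each cell's counts by that cell's total and multiplying by one million. The sketch below uses that standard formula on made-up counts; whether the package applies exactly this scaling is an assumption.

```r
# Made-up UMI counts: genes in rows, cells in columns
umi <- matrix(c(1, 3, 0,
                6, 2, 2),
              nrow = 3,
              dimnames = list(c("A", "B", "C"), c("cell1", "cell2")))

# Divide every column by its total, then scale to one million.
# A cell with zero total counts would yield NaN values.
upm <- sweep(umi, 2, colSums(umi), "/") * 1e6
```

After the conversion every cell sums to one million, so cells with different sequencing depths become comparable.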

verbose

logical indicating whether messages from this function should be printed; if FALSE, all messages are suppressed. Default is FALSE.

Value

A matrix containing the loaded expression data (an S4 Matrix object if as_Matrix is TRUE, otherwise a base R matrix).

Author(s)

Avishay Spitzer


dravishays/scandal documentation built on Jan. 8, 2020, 1:30 p.m.