load_mm_data: Load data from matrix market format files.

View source: R/io.R

load_mm_data    R Documentation

Load data from matrix market format files.

Description

Load data from matrix market format files.

Usage

load_mm_data(
  mat_path,
  feature_anno_path,
  cell_anno_path,
  header = FALSE,
  feature_metadata_column_names = NULL,
  cell_metadata_column_names = NULL,
  umi_cutoff = 100,
  quote = "\"'",
  sep = "\t"
)

Arguments

mat_path

Path to the Matrix Market .mtx matrix file. The values are read and stored as a sparse matrix with nrows rows and ncols columns, as inferred from the file. Required.

feature_anno_path

Path to a feature annotation file. The file must have nrows lines and at least one column. The values in the first column label the matrix rows and must be unique within that column. Values in additional columns are stored in the cell_data_set 'gene' metadata. For gene features, we recommend using official gene IDs, such as Ensembl or WormBase IDs, as labels. In that case, the second column typically holds a 'short' gene name, and further information such as gene_biotype may be stored in columns 3 and beyond. Required.

cell_anno_path

Path to a cell annotation file. The file must have ncols lines and at least one column. The values in the first column label the matrix columns and must be unique within that column. Values in additional columns are stored in the cell_data_set cell metadata. Required.
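The three input files described above can be sketched with base R and the Matrix package. This is only an illustration: the gene IDs, barcodes, counts, and file names are made up, and the final check mirrors the stated requirements (line counts matching the matrix dimensions, unique first-column labels) rather than the function's actual validation code.

```r
library(Matrix)

# Toy 3-feature x 2-cell count matrix (hypothetical values).
counts <- Matrix::sparseMatrix(
  i = c(1, 2, 3, 3),
  j = c(1, 1, 1, 2),
  x = c(5, 2, 1, 7),
  dims = c(3, 2)
)

# Hypothetical file names in a scratch directory.
tmp_dir <- tempfile("mm_demo_")
dir.create(tmp_dir)
mat_file <- file.path(tmp_dir, "matrix.mtx")
feature_file <- file.path(tmp_dir, "features.txt")
cell_file <- file.path(tmp_dir, "barcodes.txt")

# Matrix Market file: dimensions are recorded in the file itself.
Matrix::writeMM(counts, mat_file)

# Feature annotations: ID, short name, biotype; one line per matrix row.
features <- data.frame(
  id = c("ENSG0001", "ENSG0002", "ENSG0003"),
  gene_short_name = c("GeneA", "GeneB", "GeneC"),
  gene_biotype = c("protein_coding", "lncRNA", "protein_coding")
)
# Cell annotations: one barcode per matrix column.
cells <- data.frame(barcode = c("AAAC", "AAAG"))

write.table(features, feature_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
write.table(cells, cell_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read everything back and check the documented constraints.
mat <- Matrix::readMM(mat_file)
feature_anno <- read.table(feature_file, sep = "\t", header = FALSE,
                           stringsAsFactors = FALSE)
cell_anno <- read.table(cell_file, sep = "\t", header = FALSE,
                        stringsAsFactors = FALSE)

inputs_ok <- nrow(feature_anno) == nrow(mat) &&
  nrow(cell_anno) == ncol(mat) &&
  anyDuplicated(feature_anno[[1]]) == 0 &&
  anyDuplicated(cell_anno[[1]]) == 0
```

Files laid out this way match the defaults header = FALSE and sep = "\t".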

header

Logical. Set to TRUE if both the feature_anno_path and cell_anno_path files have column headers, or to FALSE if neither file has them; mixed cases are not supported. A header line may have either as many fields as the file has columns or one fewer. In both cases, the first column is used as the matrix dimension names. The default is FALSE.
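The two header layouts can be illustrated with base R's read.table, which accepts a header that is one field shorter than the data lines and then treats the first column as row names. Whether load_mm_data relies on exactly this mechanism is an assumption; the sketch only shows that both layouts yield the same labels. The IDs and gene names are made up.

```r
# Header with as many fields as the file has columns.
full_header <- "id\tgene_short_name\nENSG0001\tGeneA\nENSG0002\tGeneB\n"
# Header with one field fewer: the label column is left unnamed.
short_header <- "gene_short_name\nENSG0001\tGeneA\nENSG0002\tGeneB\n"

anno_full <- read.table(text = full_header, header = TRUE, sep = "\t",
                        stringsAsFactors = FALSE)
anno_short <- read.table(text = short_header, header = TRUE, sep = "\t",
                         stringsAsFactors = FALSE)

# Full header: labels are an ordinary first column.
labels_full <- anno_full[[1]]
# Short header: read.table promotes the first column to row names.
labels_short <- rownames(anno_short)
```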

feature_metadata_column_names

A character vector of feature metadata column names. The number of names must be one less than the number of columns in the feature_anno_path file. These values will replace those read from the feature_anno_path file header, if present. The default is NULL.

cell_metadata_column_names

A character vector of cell metadata column names. The number of names must be one less than the number of columns in the cell_anno_path file. These values will replace those read from the cell_anno_path file header, if present. The default is NULL.

umi_cutoff

UMI-per-cell cutoff. Columns (cells) with fewer than umi_cutoff total counts are removed from the matrix. The default is 100.
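The effect of the cutoff can be sketched on a toy matrix. This is a minimal illustration of the rule as described, not the function's internal code; the matrix values and cell names are hypothetical.

```r
library(Matrix)

# Toy 2-gene x 3-cell matrix; column totals are 120, 50, and 100.
counts <- Matrix::sparseMatrix(
  i = c(1, 2, 1, 2, 1),
  j = c(1, 1, 2, 2, 3),
  x = c(80, 40, 30, 20, 100),
  dims = c(2, 3),
  dimnames = list(c("g1", "g2"), c("cellA", "cellB", "cellC"))
)

umi_cutoff <- 100
umi_per_cell <- Matrix::colSums(counts)

# Keep cells whose total is at least the cutoff; cells below it are dropped.
keep <- as.vector(umi_per_cell >= umi_cutoff)
kept <- counts[, keep, drop = FALSE]
```

A cell whose total equals the cutoff (cellC here) is retained, because only cells with fewer counts than umi_cutoff are removed.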

quote

A character string specifying the quoting characters used in the feature_anno_path and cell_anno_path files. The default is "\"'".

sep

The field separator character in the annotation files. If sep = "", the separator is white space, that is, one or more spaces, tabs, newlines, or carriage returns. The default is the tab character, for tab-separated-value files.
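The quote and sep settings follow the conventions of base R's read.table, so their interaction can be sketched with that function directly. The annotation lines below are made up; that load_mm_data passes these arguments through unchanged is an assumption.

```r
# Two annotation lines with mixed white space and one quoted field.
anno_lines <- c(
  "ENSG0001  GeneA   protein_coding",
  "ENSG0002\tGeneB\t'long noncoding'"
)

# sep = "" splits on any run of white space; the quoting characters
# keep 'long noncoding' together as a single field.
anno <- read.table(text = anno_lines, sep = "", quote = "\"'",
                   header = FALSE, stringsAsFactors = FALSE)
```

With sep = "\t" instead, the first line would be read as a single field, because its columns are separated by spaces rather than tabs.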

Value

A cell_data_set (cds) object.

Comments

  • load_mm_data estimates size factors.
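As a rough illustration of what a size factor is, the sketch below scales each cell's total count by the geometric mean of all cell totals. This page does not state which formula load_mm_data uses, so the formula here is an assumption chosen for illustration; the counts are made up.

```r
library(Matrix)

# Toy 2-gene x 2-cell matrix; cell totals are 20 and 80.
counts <- Matrix::sparseMatrix(
  i = c(1, 2, 1, 2),
  j = c(1, 1, 2, 2),
  x = c(10, 10, 30, 50),
  dims = c(2, 2)
)

cell_total <- Matrix::colSums(counts)

# Assumed formula: cell total divided by the geometric mean of all totals.
size_factors <- cell_total / exp(mean(log(cell_total)))
```

Under this formula a cell with a typical total gets a size factor near 1, and dividing a cell's counts by its size factor puts cells of different sequencing depth on a comparable scale.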

Examples

  
    pmat <- system.file("extdata", "matrix.mtx.gz", package = "monocle3")
    prow <- system.file("extdata", "features_c3h0.txt", package = "monocle3")
    pcol <- system.file("extdata", "barcodes_c2h0.txt", package = "monocle3")
    cds <- load_mm_data(pmat, prow, pcol,
                        feature_metadata_column_names =
                          c('gene_short_name', 'gene_biotype'),
                        sep = '')

    # In this example, the features_c3h0.txt file has three columns,
    # separated by spaces. The first column has official gene names, the
    # second has short gene names, and the third has gene biotypes.
  


cole-trapnell-lab/monocle3 documentation built on April 7, 2024, 9:24 p.m.