Read10x: Read 10x Space Ranger output data

read10xRaw R Documentation

Read 10x Space Ranger output data

Description

read10xRaw() is a convenience function that reads the raw expression data from 10x Space Ranger outputs and returns the count matrix as an R object.

read10xRawH5() reads a 10x Space Ranger output HDF5 file (ending in .h5).

read10xSlide() reads slide information (e.g. spot positions) and the tissue image from 10x Space Ranger outputs. It is based on 10x's secondary analysis pipeline: https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/rkit

Usage

read10xRaw(count_dir = NULL, row_name = "symbol", meta = FALSE)

read10xRawH5(h5_file, row_name = "symbol", meta = FALSE)

read10xSlide(tissue_csv_file, tissue_img_file = NULL, scale_factor_file = NULL)

Arguments

count_dir

(chr) The directory containing the 10x output matrix data. It should include three files: barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz.

row_name

(chr) Whether to use gene symbols (row_name = "symbol") or Ensembl gene IDs (row_name = "id") as row names of the count matrix. Default: row_name = "symbol"

meta

(logical) If TRUE, read10xRaw() or read10xRawH5() returns a list containing both the count matrix and the metadata of genes (features). Metadata includes feature names, IDs and any additional information present in the Space Ranger output. If FALSE, only the count matrix is returned. Default: FALSE

h5_file

(chr) The path of the 10x output matrix HDF5 file (ending in .h5).

tissue_csv_file

(chr) The path of 10x output CSV file of spot positions, usually named tissue_positions_list.csv for Space Ranger V1 and tissue_positions.csv for Space Ranger V2.

tissue_img_file

(chr) The path of the 10x output low-resolution tissue image in PNG format, usually named tissue_lowres_image.png. If NULL, the returned slide data does not contain image information. Provide this file whenever it is available. Default: NULL

scale_factor_file

(chr) The path of the 10x output scale factor file in JSON format, usually named scalefactors_json.json. If NULL, spot positions in the image will not be corrected by the scale factor. Provide this file whenever it is available. Default: NULL
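
As a rough illustration of what the scale factor does: Space Ranger reports spot positions in full-resolution pixel coordinates, and multiplying them by the low-resolution scale factor (the tissue_lowres_scalef field of scalefactors_json.json) maps them onto tissue_lowres_image.png. The sketch below uses a made-up scale factor and made-up spot coordinates; the output column names imagerow and imagecol are chosen for illustration and are not necessarily those produced by read10xSlide().

```r
# Hypothetical value; in practice it is read from the
# "tissue_lowres_scalef" field of scalefactors_json.json.
tissue_lowres_scalef <- 0.05

# Made-up spot positions in full-resolution pixel coordinates.
spots <- data.frame(
  barcode = c("BC1-1", "BC2-1"),
  pxl_row_in_fullres = c(2000, 4000),
  pxl_col_in_fullres = c(1000, 3000)
)

# Rescale to the coordinate system of the low-resolution image.
spots$imagerow <- spots$pxl_row_in_fullres * tissue_lowres_scalef
spots$imagecol <- spots$pxl_col_in_fullres * tissue_lowres_scalef
spots
```

Without this rescaling, spot coordinates would not line up with the low-resolution image when the two are plotted together.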

Value

If meta = TRUE, read10xRaw() or read10xRawH5() returns a list of two elements: a "dgCMatrix" sparse matrix containing expression counts and a data frame containing metadata of genes (features). In the count matrix, each row is a gene (feature) and each column is a spot barcode. If meta = FALSE, only the count matrix is returned.
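
For orientation, the sketch below reads a simulated matrix directory by hand with the Matrix package and assembles a "dgCMatrix" with gene symbols as row names. It only illustrates the file layout and the shape of the returned object; it is not the implementation of read10xRaw(), and the directory name, the write_gz() helper, gene names and barcodes are all invented for the example.

```r
library(Matrix)

# Simulated input: 3 genes x 2 spots (all names and values are made up).
sim_dir <- file.path(tempdir(), "manual_read_example")
dir.create(sim_dir, showWarnings = FALSE)
counts <- sparseMatrix(i = c(1, 3, 2), j = c(1, 1, 2), x = c(5, 2, 7),
                       dims = c(3, 2))

# Hypothetical helper: write text lines to a gzip-compressed file.
write_gz <- function(lines, path) {
  con <- gzfile(path, open = "w")
  on.exit(close(con))
  writeLines(lines, con)
}

# Write the three files expected in a 10x matrix directory.
mtx_tmp <- tempfile(fileext = ".mtx")
writeMM(counts, file = mtx_tmp)
write_gz(readLines(mtx_tmp), file.path(sim_dir, "matrix.mtx.gz"))
write_gz(paste(c("ENSG_fake_A", "ENSG_fake_B", "ENSG_fake_C"),
               c("GeneA", "GeneB", "GeneC"), "Gene Expression", sep = "\t"),
         file.path(sim_dir, "features.tsv.gz"))
write_gz(c("BC1-1", "BC2-1"), file.path(sim_dir, "barcodes.tsv.gz"))

# Read the files back; R connections decompress .gz files transparently.
features <- read.delim(file.path(sim_dir, "features.tsv.gz"), header = FALSE,
                       col.names = c("id", "symbol", "type"),
                       stringsAsFactors = FALSE)
barcodes <- readLines(file.path(sim_dir, "barcodes.tsv.gz"))
mat <- as(readMM(file.path(sim_dir, "matrix.mtx.gz")), "CsparseMatrix")

# Rows are genes (features), columns are spot barcodes.
rownames(mat) <- features$symbol   # use features$id for Ensembl IDs
colnames(mat) <- barcodes
str(mat)
```

Here features plays the role of the gene metadata data frame and mat the role of the count matrix.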

read10xSlide() returns a list of two objects. The first object, slide, is a data.frame in which each row corresponds to a spot and each column holds slide information such as row and column positions on the slide. The second object, grob, is a Grid Graphical Object of the tissue image, available when tissue_img_file is specified.
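
The spot positions CSV differs between Space Ranger versions: tissue_positions_list.csv (V1) has no header row, while tissue_positions.csv (V2) has one. The sketch below shows one way to read either layout into the same data frame using base R. The helper read_positions() and the simulated rows are hypothetical, and this is not necessarily how read10xSlide() handles the two formats internally.

```r
# Column names follow the Space Ranger documentation for the positions file.
pos_cols <- c("barcode", "in_tissue", "array_row", "array_col",
              "pxl_row_in_fullres", "pxl_col_in_fullres")

# Hypothetical helper: read a positions CSV with or without a header row.
read_positions <- function(path) {
  has_header <- startsWith(readLines(path, n = 1), "barcode")
  read.csv(path, header = has_header, col.names = pos_cols,
           stringsAsFactors = FALSE)
}

# Simulate the same two spots in both layouts (values are made up).
rows <- c("BC1-1,1,0,0,1200,3400", "BC2-1,0,0,2,1200,3600")
v1_file <- file.path(tempdir(), "tissue_positions_list.csv")
v2_file <- file.path(tempdir(), "tissue_positions.csv")
writeLines(rows, v1_file)
writeLines(c(paste(pos_cols, collapse = ","), rows), v2_file)

slide_v1 <- read_positions(v1_file)
slide_v2 <- read_positions(v2_file)
str(slide_v1)
all.equal(slide_v1, slide_v2)
```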

Examples


# simulate 10x output files of count matrix
library(SpotClean)
data(mbrain_raw)
data_dir <- file.path(tempdir(), "sim_example")
dir.create(data_dir, showWarnings = FALSE)
matrix_file <- file.path(data_dir, "matrix.mtx")
# gzip-compressed output connections for barcodes and features
barcode_con <- gzfile(file.path(data_dir, "barcodes.tsv.gz"), open = "wb")
gene_con <- gzfile(file.path(data_dir, "features.tsv.gz"), open = "wb")

# For simplicity, use gene names to generate gene IDs to fit the format.
gene_name <- rownames(mbrain_raw)
gene_id <- paste0("ENSG_fake_", gene_name)
barcode_id <- colnames(mbrain_raw)

Matrix::writeMM(mbrain_raw, file = matrix_file)
write(barcode_id, file = barcode_con)
write.table(cbind(gene_id, gene_name, "type"), file = gene_con,
    sep = "\t", quote = FALSE, col.names = FALSE, row.names = FALSE)
# compress matrix.mtx to matrix.mtx.gz (the uncompressed file is removed)
R.utils::gzip(matrix_file)
close(barcode_con)
close(gene_con)


# read expression count matrix
list.files(data_dir)
mbrain_raw_new <- read10xRaw(data_dir)
str(mbrain_raw_new)
identical(mbrain_raw, mbrain_raw_new)

# read slide metadata
spatial_dir <- system.file(file.path("extdata",
                                     "V1_Adult_Mouse_Brain_spatial"),
                           package = "SpotClean")
list.files(spatial_dir)
mbrain_slide_info <- read10xSlide(
    tissue_csv_file = file.path(spatial_dir, "tissue_positions_list.csv"),
    tissue_img_file = file.path(spatial_dir, "tissue_lowres_image.png"),
    scale_factor_file = file.path(spatial_dir, "scalefactors_json.json"))
str(mbrain_slide_info)

zijianni/SpotClean documentation built on Nov. 15, 2023, 12:53 a.m.