ReadPeakCounts: Read in peak data saved in MEX format

View source: R/data_util.R

ReadPeakCounts    R Documentation

Read in peak data saved in MEX format

Description

Read in peak data saved in MEX format. The input files may be gzipped (.gz).

Usage

ReadPeakCounts(
  data.dir = NULL,
  mm.file = NULL,
  barcodes.file = NULL,
  sites.file = NULL
)

Arguments

data.dir

directory where output from CountPeaks is stored

mm.file

count matrix in MEX format

barcodes.file

file containing cell barcodes corresponding to columns in the matrix

sites.file

file containing peak coordinate names corresponding to rows in the matrix

Value

a sparseMatrix
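
As a rough illustration of how the three input files relate to the returned matrix, the layout can be sketched with Matrix::readMM and readLines. This is not Sierra's implementation: the helper name read_mex_sketch, the toy barcodes and the peak names are all hypothetical, and the sketch assumes uncompressed files with one name per line.

```r
library(Matrix)

# Hypothetical helper (NOT part of Sierra): shows how a MEX-format count
# matrix is combined with its row and column name files.
read_mex_sketch <- function(mm.file, barcodes.file, sites.file) {
  counts <- Matrix::readMM(mm.file)     # sparse counts: rows = peaks, columns = cells
  barcodes <- readLines(barcodes.file)  # one cell barcode per matrix column
  sites <- readLines(sites.file)        # one peak name per matrix row
  stopifnot(nrow(counts) == length(sites),
            ncol(counts) == length(barcodes))
  dimnames(counts) <- list(sites, barcodes)
  counts
}

# Made-up toy data: 3 peaks x 2 cells, written to a fresh temporary directory
toy.dir <- tempfile("mex_sketch_")
dir.create(toy.dir)
toy <- Matrix::Matrix(matrix(c(0, 2, 0, 1, 0, 3), nrow = 3), sparse = TRUE)
Matrix::writeMM(toy, file.path(toy.dir, "matrix.mtx"))
writeLines(c("AAAC-1", "TTTG-1"), file.path(toy.dir, "barcodes.tsv"))
writeLines(c("peak_a", "peak_b", "peak_c"), file.path(toy.dir, "sitenames.tsv"))

sketch <- read_mex_sketch(
  mm.file = file.path(toy.dir, "matrix.mtx"),
  barcodes.file = file.path(toy.dir, "barcodes.tsv"),
  sites.file = file.path(toy.dir, "sitenames.tsv")
)
```

The dimension check in the helper is a design choice for the sketch: it fails early if the name files do not match the matrix shape, rather than silently mislabelling rows or columns.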

Examples

# The following commands can be used to generate a new random sample data set:
# barcode_seq <- stringi::stri_rand_strings(12, 14, pattern = "[ACTG]")
# barcode_seq <- paste0(barcode_seq, "-1")
# Below is a hard-coded example.

barcode_seq <- c("TCCCAGTACTGGGC-1", "CCAGAGAAAAACTT-1", "CGATAGGGGTAACA-1", 
"GGCGGATGGAGATT-1", "ATCAGTACATCTAT-1", "TTTCCCGTACCACA-1", "TTGTGTACGGGATG-1", 
"CAGGGCATAGTCTA-1", "GCTCTTTGGCTGAG-1", "AGTCGTATCACTAA-1", "CGGTTGGCTGGTAT-1", 
"TGACCTGGAGCTGC-1")

# Note: siteNames could also be gene names
siteNames <- paste0("Gene_", letters[1:12])

# For this working example, set siteNames to peak coordinates
siteNames <- c("Sash1:10:8722219-8722812:-1", "Sash1:10:8813689-8814157:-1", 
             "Lamp2:X:38419489-38419901:-1", "Lamp2:X:38405042-38405480:-1", 
             "Lamp2:X:38455818-38456298:-1", "Pecam1:11:106654217-106654585:-1", 
             "Ly6e:15:74958936-74959338:1", "Ly6e:15:74956076-74956512:1", 
             "Pnkd:1:74285960-74287456:1", "Pdgfra:5:75197715-75198215:1", 
             "Dlc1:8:36567751-36568049:-1", "Dlc1:8:36568379-36568865:-1")

# Randomly generate a count matrix that contains many zeros.
# Columns are cells, rows are peaks.
matrix_A <- matrix(round(rexp(144,rate = 1),digits = 0), nrow = 12,ncol = 12)
matrix_B <- matrix(round(rexp(144,rate = 0.7),digits = 0), nrow = 12,ncol = 12)
matrix_mtx <- matrix_A * matrix_B
matrix_mtx <- Matrix::Matrix(matrix_mtx, sparse=TRUE)

# Save the example to appropriately named files in a temporary location
data.dir <- tempdir()
barcodes.file <- file.path(data.dir, "barcodes.tsv")
writeLines(barcode_seq, barcodes.file)
mm.file <- file.path(data.dir, "matrix.mtx")
Matrix::writeMM(matrix_mtx, mm.file)
sites.file <- file.path(data.dir, "sitenames.tsv")
writeLines(siteNames, sites.file)

# Read the data with Sierra's ReadPeakCounts by passing only the directory name
count.matrix <- Sierra::ReadPeakCounts(data.dir = data.dir)

# Or by passing the full path of each file
count.matrix <- Sierra::ReadPeakCounts(
  barcodes.file = barcodes.file,
  mm.file = mm.file,
  sites.file = sites.file
)

VCCRI/Sierra documentation built on July 3, 2023, 6:39 a.m.