SplitBam | R Documentation |
Given a bam file that was processed by CellRanger, splitBam splits the bam into multiple bam files, one per cell barcode. Bam file needs to have the barcode stored in the "CB" field.
SplitBam(
bam,
cellbc.df,
outdir = NULL,
yieldSize = 1e+06,
gtf_gr = NULL,
geneSymbol = NULL,
gi_ext = 50,
rle_output = FALSE,
exportFastqHeader = FALSE,
genomicRegion = NULL,
bamTags = c("CB", "UB"),
what = c("qname", "flag", "rname", "strand", "pos")
)
bam |
CellRanger outputted bam file with the CB field |
cellbc.df |
data frame of the cell barcode, needs to have the column names: "celltype" and "cellbc" |
outdir |
directory to output the bam files. The bam files will be called [celltype].bam. If NULL no BAM file created. |
yieldSize |
number of lines of bam files to load. Default: 1000000 |
gtf_gr |
gene model genomic ranges. Only used if geneSymbol is defined. |
geneSymbol |
Gene symbol. Used to identify the genomic coordinates to extract reads from. |
gi_ext |
The number of nucleotides to extend the genomic interval in extracting reads from (default 50). |
rle_output |
If TRUE will generate and return rle_list object |
exportFastqHeader |
If TRUE will generate a txt output file that has same prefix as bam file containing fastq header IDs |
genomicRegion |
Granges object of genomic region to extract. Only used if geneSymbol not defined. |
bamTags |
BAM field tag identifiers to extract. Default is c("CB", "UB"). |
what |
What BAM fields to copy into new file. Default is c('qname', 'flag', 'rname', 'strand', 'pos') |
a rleList of coverage for each cell type
library('Sierra')
# Example 1 split the entire BAM file for each cell type
## Not run:
extdata_path <- system.file("extdata",package = "scpolya")
load(paste(extdata_path,"TIP_vignette_gene_Seurat.RData",sep="/"))
cellbc.df <- data.frame(celltype=genes.seurat@active.ident,
cellbc= names(genes.seurat@active.ident))
bamfile <- c(paste0(extdata_path,"/Vignette_example_TIP_sham.bam")
SplitBam(bam, cellbc.df)
## End(Not run)
# Example 2 extract reads that overlap a gene
extdata_path <- system.file("extdata",package = "Sierra")
gtf.file <- paste0(extdata_path,"/Vignette_cellranger_genes_subset.gtf")
gtf.gr <- rtracklayer::import(gtf.file)
load(paste(extdata_path,"TIP_vignette_gene_Seurat.RData",sep="/"))
cellbc.df <- data.frame(celltype=genes.seurat@active.ident,
cellbc= names(genes.seurat@active.ident))
# Modify cellbc.df so that the barcodes match what is in the BAM file
cellbc.df$cellbc <- sub("(.*)-.*", "\\1", cellbc.df$cellbc)
cellbc.df$cellbc <- paste0(cellbc.df$cellbc, "-1")
bam.file <- paste0(extdata_path,"/Vignette_example_TIP_mi.bam")
outdir <- tempdir() # change this to a meaningful location
SplitBam(bam.file, cellbc.df, outdir=outdir, gtf_gr=gtf.gr, geneSymbol="Dnajc19")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.