View source: R/SummarizedExperiment_helpers.R
makeSummarizedExperimentFromBam | R Documentation |
Make a SummarizedExperiment / matrix object from BAM files or other library formats specified by the lib.type argument. Works like HTSeq, giving you count tables per library.
makeSummarizedExperimentFromBam(
df,
saveName = NULL,
longestPerGene = FALSE,
geneOrTxNames = "tx",
region = "mrna",
type = "count",
lib.type = "ofst",
weight = "score",
forceRemake = FALSE,
force = TRUE,
library.names = bamVarName(df),
BPPARAM = BiocParallel::SerialParam()
)
df |
an ORFik experiment |
saveName |
a character (default NULL). If set, the experiment is saved to the given path. It is always saved as .rds; adding the .rds extension yourself is optional, since it is appended if not present. Also used to load an existing file with that name. |
longestPerGene |
a logical (default FALSE). If FALSE, all transcript isoforms per gene are used; if TRUE, only the longest isoform per gene is kept. Ignored if "region" is not a character of either: "mRNA", "tx", "cds", "leaders" or "trailers". |
geneOrTxNames |
a character vector (default "tx"). Should row names keep transcript names ("tx") or change to gene names ("gene")? |
region |
a character vector (default: "mrna"). Make raw count matrices
of whole mRNAs or of one of: leaders, cds, trailers.
Can also be a |
type |
default: "count" (raw counts matrix), alternative is "fpkm", "log2fpkm" or "log10fpkm" |
lib.type |
a character (default: "ofst"). Load the files in the experiment or some precomputed variant, either "ofst", "bedo", "bedoc" or "pshifted". These are made with ORFik:::convertLibs() or shiftFootprintsByExperiment(). Can also be custom user-made folders inside the experiment's bam folder. |
weight |
numeric or character, a column to score overlaps by. Default "score", will check for a metacolumn called "score" in libraries. If not found, will not use weights. |
forceRemake |
logical, default FALSE. If TRUE, will not look for existing count table files. |
force |
logical, default TRUE. If TRUE, reload library files even if
matching named variables are found in environment used by experiment
(see |
library.names |
character, default: bamVarName(df). Names to load libraries as to environment and names to display in plots. |
BPPARAM |
how many cores/threads to use? default: BiocParallel::SerialParam() |
If a txdb or gtf path is added, the result is a RangedSummarizedExperiment.
NOTE: If a file called saveName already exists, it is loaded,
not remade!
There are different ways of counting hits on transcripts. ORFik does
it as pure coverage (if a single read aligns to a region with 2 genes, both
get a count of 1 from that read).
This is the safest way to avoid false negatives
(genes with no assigned hits that actually have true hits).
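The pure-coverage rule described above can be illustrated with a small base R sketch. This is a toy example with made-up coordinates and hypothetical names (regions, reads, overlaps); it is not ORFik's implementation and ignores strand and spliced transcripts. It only shows that a read overlapping two regions adds 1 to each of them.

```r
# Toy illustration only (base R, not ORFik code): count reads per region,
# letting a read count once for every region it overlaps.
regions <- data.frame(
  name  = c("geneA", "geneB"),   # hypothetical, partly overlapping genes
  start = c(100, 150),
  end   = c(200, 250)
)
reads <- data.frame(
  start = c(110, 160, 240),      # made-up read coordinates
  end   = c(139, 189, 269)
)

# A read overlaps a region when the two closed intervals intersect.
overlaps <- function(region_start, region_end) {
  reads$start <= region_end & reads$end >= region_start
}

# One count per region: the number of reads overlapping it.
counts <- mapply(function(s, e) sum(overlaps(s, e)),
                 regions$start, regions$end)
names(counts) <- regions$name
print(counts)
```

Here the second read lies inside both genes, so it is counted for both, and the per-region counts sum to more than the number of reads.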
a SummarizedExperiment
object, or a data.table if
"type" is not "count", with rownames as transcript / gene names.
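For orientation, the conventional definition of FPKM is given below. It is stated here as an assumption about what type = "fpkm" returns, not as something confirmed by this page; the symbols c_i (raw count for region i), L_i (length of region i in bases) and N (total counted reads in the library) are introduced only for this illustration.

```latex
\mathrm{FPKM}_i = \frac{10^{9} \, c_i}{L_i \, N}
```

Under the same assumption, "log2fpkm" and "log10fpkm" would be log-transformed versions of this value.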
## Make experiment
df <- ORFik.template.experiment()
# makeSummarizedExperimentFromBam(df)
## Only cds (coding sequences):
# makeSummarizedExperimentFromBam(df, region = "cds")
## FPKM instead of raw counts on whole mrna regions
# makeSummarizedExperimentFromBam(df, type = "fpkm")
## Make count tables of pshifted libraries over uORFs
uorfs <- GRangesList(uorf1 = GRanges("chr23", 17599129:17599156, "-"))
# saveName <- file.path(dirname(df$filepath[1]), "uORFs", "countTable_uORFs")
# makeSummarizedExperimentFromBam(df, saveName, region = uorfs)
## To load the uORFs later
# countTable(df, region = "uORFs", count.folder = "uORFs")