outputLibs: Output NGS libraries to R as variables
In Roleren/ORFik: Open Reading Frames in Genomics

outputLibs

R Documentation

Description

By default, loads the original files of the experiment into the global environment, with each library named by the rows of the experiment required to make all library names unique.
Loading uses multiple cores, as defined by the `BPPARAM` argument.

Usage

outputLibs(
  df,
  type = "default",
  paths = filepath(df, type),
  param = NULL,
  strandMode = 0,
  naming = "minimum",
  library.names = name_decider(df, naming),
  output.mode = "envir",
  chrStyle = NULL,
  envir = envExp(df),
  verbose = TRUE,
  force = TRUE,
  validate_libs = TRUE,
  BPPARAM = bpparam()
)

Arguments

`df`	an ORFik `experiment`
`type`	a character (default: "default"), load the files in the experiment or some precomputed variant, like "ofst" or "pshifted". These are made with ORFik:::convertLibs(), shiftFootprintsByExperiment(), etc. Can also be custom user-made folders inside the experiment's bam folder. It acts recursively, with priority: if you state "pshifted" but it does not exist, it checks "ofst". If there are no .ofst files, it uses "default", which must always exist. Presets are (folders are relative to the default lib folder; some types fall back to other formats if the folder does not exist):
- "default": load the original files for the experiment, usually bam.
- "ofst": load ofst files from the ofst folder, relative to the lib folder (falls back to default).
- "pshifted": load ofst, wig or bigwig from the pshifted folder (falls back to ofst, then default).
- "cov": load covRle objects from the cov_RLE folder (fails if not found).
- "covl": load covRleList objects from the cov_RLE_List folder (fails if not found).
- "bed": load bed files from the bed folder (falls back to default).
- Other formats must be loaded directly with fimport.
`paths`	character vector, the file paths to use, default `filepath(df, type)`. If the paths are not correct, change the `type` argument first. If that is not enough, you can set this argument directly, but do so with care.
`param`	`NULL` or a ScanBamParam object. Like for `scanBam`, this influences what fields and which records are imported. However, note that the fields specified through this ScanBamParam object will be loaded in addition to any field required for generating the returned object (GAlignments, GAlignmentPairs, or GappedReads object), but only the fields requested by the user will actually be kept as metadata columns of the object. By default (i.e. `param=NULL` or `param=ScanBamParam()`), no additional field is loaded. The flag used is `scanBamFlag(isUnmappedQuery=FALSE)` for `readGAlignments`, `readGAlignmentsList`, and `readGappedReads` (i.e. only records corresponding to mapped reads are loaded), and `scanBamFlag(isUnmappedQuery=FALSE, isPaired=TRUE, hasUnmappedMate=FALSE)` for `readGAlignmentPairs` (i.e. only records corresponding to paired-end reads with both ends mapped are loaded).
`strandMode`	numeric, default 0. Only used for paired-end bam files. One of (0: strand = *, 1: first read of pair is +, 2: first read of pair is -). See ?strandMode. Note: the default here is 0, not 1 as in readGAlignmentPairs. This guarantees hits, but it also means reads can match overlapping transcripts on the opposite strand.
`naming`	a character (default: "minimum"). Name files by the minimum information needed to make all files unique. Set to "full" to get full names. Set to "fullexp" to get full names with the experiment name as prefix; only this last option guarantees uniqueness.
`library.names`	character vector, names of libraries, default: name_decider(df, naming)
`output.mode`	character, default "envir". Output libraries to environment. Alternative: "list", return as list. "envirlist", output to envir and return as list. If output is list format, the list elements are named from: `bamVarName(df.rfp)` (Full or minimum naming based on 'naming' argument)
`chrStyle`	a GRanges object, TxDb, FaFile, a `seqlevelsStyle` or `Seqinfo` to get the seqlevelsStyle from (default: NULL). In addition, if it is a Seqinfo object, seqinfo will be updated. Example of a seqlevelsStyle update: is chromosome 1 called chr1 or 1, is the mitochondrial chromosome called MT or chrM, etc. Will use the first seqlevelsStyle if more than one is present, like: c("NCBI", "UCSC") -> pick "NCBI".
`envir`	environment to save to, default `envExp(df)`, which defaults to .GlobalEnv, but can be set with `envExp(df) <- new.env()` etc.
`verbose`	logical, default TRUE, message about library output status.
`force`	logical, default TRUE. If TRUE, reload library files even if matching named variables are found in the environment used by the experiment (see `envExp`). This is a simple way to make sure the correct libraries are always loaded. FALSE is faster if the data is already loaded correctly.
`validate_libs`	logical, default TRUE. If FALSE, don't check that the default files exist (i.e. bam files). Useful if you are using pshifted ofst files etc. and no longer have the bam files.
`BPPARAM`	how many cores/threads to use? default: bpparam(). To see number of threads used, do `bpparam()$workers`. You can also add a time remaining bar, for a more detailed pipeline.
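
The fallback priority described for `type` can be sketched in base R. This is only an illustration under stated assumptions, not ORFik's implementation: the helper `resolve_type()` is hypothetical, it ignores custom user-made folders, and the real lookup is done through `filepath(df, type)`. The folder names are the ones listed for the presets.

```r
# Hypothetical helper: walk the fallback chain for a requested type and
# return the first type whose folder exists under the library folder.
resolve_type <- function(lib_dir, type = "default") {
  # Fallback chains as described for the presets; "cov" and "covl" have none.
  chains <- list(
    default  = "default",
    ofst     = c("ofst", "default"),
    pshifted = c("pshifted", "ofst", "default"),
    bed      = c("bed", "default"),
    cov      = "cov",
    covl     = "covl"
  )
  # Folder of each type, relative to the library folder.
  folders <- c(default = ".", ofst = "ofst", pshifted = "pshifted",
               bed = "bed", cov = "cov_RLE", covl = "cov_RLE_List")
  if (!type %in% names(chains)) stop("unknown preset: ", type)
  for (candidate in chains[[type]]) {
    if (dir.exists(file.path(lib_dir, folders[[candidate]]))) return(candidate)
  }
  stop("no folder found for type '", type, "'")
}

# Demo on a temporary library folder that only has an ofst subfolder.
lib_dir <- file.path(tempdir(), "demo_libs")
dir.create(file.path(lib_dir, "ofst"), recursive = TRUE, showWarnings = FALSE)

resolved_pshifted <- resolve_type(lib_dir, "pshifted")  # no pshifted folder, uses ofst
resolved_default  <- resolve_type(lib_dir, "default")
cov_result <- tryCatch(resolve_type(lib_dir, "cov"),
                       error = function(e) "error")    # no fallback for cov
```

In this sketch, a request for "pshifted" resolves to "ofst" because only the ofst folder exists, while "cov" fails because it has no fallback.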

Details

The function checks whether the total set of libraries has already been loaded, i.e. whether all names from 'library.names' exist as S4 objects in the environment of the experiment.
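
The check described here can be sketched in base R. This is a rough illustration, not the package's code: the helper `libs_loaded()` and the class `DemoLib` are hypothetical, and looking only in the given environment (`inherits = FALSE`) is an assumption.

```r
library(methods)  # setClass(), new(), isS4()

# Hypothetical helper: TRUE only if every name is bound to an S4 object in envir.
libs_loaded <- function(library.names, envir) {
  all(vapply(library.names, function(nm) {
    exists(nm, envir = envir, inherits = FALSE) &&
      isS4(get(nm, envir = envir, inherits = FALSE))
  }, logical(1)))
}

# Stand-in S4 class, used instead of a real GAlignments or GRanges object.
setClass("DemoLib", representation(reads = "integer"))

env <- new.env()
assign("RFP_WT_r1", new("DemoLib", reads = 1:3), envir = env)
assign("RFP_WT_r2", 1:3, envir = env)  # plain vector, so it does not count as loaded

one_loaded  <- libs_loaded("RFP_WT_r1", env)
both_loaded <- libs_loaded(c("RFP_WT_r1", "RFP_WT_r2"), env)
```

With `force = FALSE`, a check of this kind is what lets a reload be skipped. In the sketch, `both_loaded` is FALSE because one of the names is not bound to an S4 object.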

Value

NULL (libraries set by envir assignment), unless output.mode is "list" or "envirlist", in which case a list of the libraries is returned.
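
How the three `output.mode` values differ can be sketched with a hypothetical stand-in, `emit_libs()`, that takes an already-loaded named list. The function, the placeholder data and the invisible `NULL` return are assumptions made for illustration and are not taken from ORFik's source.

```r
# Hypothetical stand-in for the output step: `libs` is a named list of
# already-loaded libraries.
emit_libs <- function(libs, output.mode = "envir", envir = .GlobalEnv) {
  stopifnot(output.mode %in% c("envir", "list", "envirlist"))
  if (output.mode %in% c("envir", "envirlist")) {
    # Assign each library to a variable named after its list name.
    for (nm in names(libs)) assign(nm, libs[[nm]], envir = envir)
  }
  if (output.mode == "envir") return(invisible(NULL))
  libs
}

libs <- list(RFP_WT_r1 = 1:3, RFP_WT_r2 = 4:6)  # placeholder data, not real reads

env <- new.env()
res_envir <- emit_libs(libs, "envir", env)       # NULL, variables created in env

list_env <- new.env()
res_list <- emit_libs(libs, "list", list_env)    # list returned, nothing assigned

both_env <- new.env()
res_both <- emit_libs(libs, "envirlist", both_env)  # assigned and returned
```

The "list" mode is convenient inside functions or pipelines, since it avoids creating variables in a shared environment.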

Examples

## Load a template ORFik experiment
df <- ORFik.template.experiment()
## Default library type load, usually bam files
outputLibs(df, type = "default")
RFP_WT_r1
attr(RFP_WT_r1, "filepath")
attr(RFP_WT_r1, "exp")
## .ofst file load, if ofst files do not exist
## it will load default
# outputLibs(df, type = "ofst")
## .wig file load, if wiggle files do not exist
## it will load default
# outputLibs(df, type = "wig")
## Load as list
outputLibs(df, output.mode = "list")
## Load libs to new environment (called ORFik in Global)
# outputLibs(df, envir = assign(name(df), new.env(parent = .GlobalEnv)))
## Load to hidden environment given by experiment
# envExp(df) <- new.env()
# outputLibs(df)

Roleren/ORFik documentation built on April 12, 2025, 5:31 a.m.