annotate.targets: title


View source: R/fishhook.R


Takes as input GRanges targets, an optional set of "covered" intervals, and an arbitrary list of covariates, which can be R objects (GRanges, ffTrack, Rle) or file paths to .rds, .bw, or .bed files, and returns a GRanges of annotated target intervals with covariates computed for each interval. These target intervals can be further annotated with mutation counts and plugged into a generalized linear regression (or other) model downstream.

There are three types of covariates: numeric, sequence, and interval. The covariates are computed for each target interval as follows:

numeric covariates: the mean value
sequence covariates: fraction of bases satisfying $signature
interval covariates: fraction of bases overlapping the feature
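The three statistics can be illustrated with a minimal base R sketch on a single toy interval. This is not the package's implementation: the per-base vectors below are made-up data, and the variable names are hypothetical.

```r
# Toy per-base data for one 10 bp target interval (all values are made up).
score   <- c(1, 2, 3, 4, 5, 5, 4, 3, 2, 1)                      # numeric track
bases   <- c("A", "C", "G", "T", "G", "C", "A", "A", "T", "G")  # sequence track
feature <- c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE,
             FALSE, FALSE, FALSE, FALSE)                        # interval track

# numeric covariate: mean value over the interval
numeric_cov <- mean(score, na.rm = TRUE)

# sequence covariate: fraction of bases satisfying a signature (here G or C)
sequence_cov <- mean(bases %in% c("G", "C"))

# interval covariate: fraction of bases overlapping the feature
interval_cov <- mean(feature)
```

Each statistic is a single number per target interval, which is what ends up as a metadata column on the annotated targets.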


annotate.targets(targets, covered = NULL, events = NULL, mc.cores = 1,
  na.rm = TRUE, pad = 0, verbose = TRUE, max.slice = 1000,
  ff.chunk = 1e+06, max.chunk = 1e+11, out.path = NULL,
  covariates = list(), maxpatientpergene = Inf, ptidcol = NULL,
  weightEvents = FALSE, ...)



targets: path to bed or rds containing genomic target regions with optional target name


covered: optional path to bed or rds containing a GRanges object of "covered" genomic regions (default = NULL)


events: optional path to bed or rds containing ranges corresponding to events (i.e. mutations etc.) (default = NULL)


mc.cores: integer number of cores to use (default = 1)


na.rm: logical; how to treat NA values for covariates that do not set their own na.rm (default = TRUE)


pad: info (default = 0)


verbose: boolean verbose flag (default = TRUE)


max.slice: integer; max slice of intervals to evaluate with gr.val (default = 1e3)


ff.chunk: integer; max chunk to evaluate with fftab (default = 1e6)


max.chunk: integer; gr.findoverlaps parameter (default = 1e11)


out.path: path to save the result to (default = NULL)


covariates: list of lists where each inner list represents a covariate. Each inner list can have the elements track, type, signature, name, pad, na.rm (defaults to the na.rm argument), field, and grep. See the Cov_Arr class for a description of what each of these elements does. Note that track is equivalent to the 'Covariate' parameter in Cov_Arr.


maxpatientpergene: sets the maximum number of events a patient can contribute per target (default = Inf)


ptidcol: string; column where the patient ID is stored (default = NULL)


...: additional covariates whose output names will be their argument names. Each consists of a list with a $track field corresponding to a GRanges, RleList, or ffTrack object (or a path to an rds containing that object) and a $type field, which can have one of three values: "numeric", "sequence", or "interval".

Numeric tracks must have a $score field if they are GRanges, and can have a $na.rm logical field describing how to treat NA values (set to the na.rm argument by default).

Sequence covariates must be ffTrack objects (or paths to ffTrack rds) and require an additional variable $signatures, which will be used as input to fftab, and can have an optional logical argument $grep to specify inexact matches (see fftab).

fftab signature: signatures is a named list that specifies what is to be tallied. Each signature (i.e. list element) consists of an arbitrary-length character vector specifying strings to match exactly, or a length-1 character vector to grepl (if grep = TRUE), or a length-1 or length-2 numeric vector specifying an exact value or an interval to match (for numeric data). Every list element of signature will become a metadata column in the output GRanges specifying how many positions in the given interval match the given query.

Interval covariates must be GRanges (or paths to GRanges rds) or paths to bed files.


weightEvents: boolean; if TRUE, events are weighted by their overlap with targets, e.g. if 10% of an event overlaps a target region, that target region is assigned a score of 0.1 for that event. If FALSE, any overlap is given a weight of 1 (default = FALSE).
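The event weighting can be sketched in base R as follows. This is an illustration only, under the assumption that the weight is the fraction of the event's width that falls inside the target; `event_weight` and its arguments are hypothetical names, not part of the package.

```r
# Hypothetical helper: weight of one event with respect to one target.
# Coordinates are 1-based and inclusive. Assumption: with weighting on,
# the weight is the fraction of the event's width overlapping the target.
event_weight <- function(ev_start, ev_end, tg_start, tg_end,
                         weightEvents = FALSE) {
  overlap <- max(0, min(ev_end, tg_end) - max(ev_start, tg_start) + 1)
  if (overlap == 0) return(0)      # no overlap, event does not count
  if (!weightEvents) return(1)     # any overlap counts as a full event
  overlap / (ev_end - ev_start + 1)
}

# A 100 bp event (91-190) with 10 bp inside the target (1-100)
w_weighted   <- event_weight(91, 190, 1, 100, weightEvents = TRUE)
w_unweighted <- event_weight(91, 190, 1, 100, weightEvents = FALSE)
```

Under this assumption the weighted call yields 0.1 and the unweighted call yields 1, matching the behaviour described for the two settings.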


GRanges of input targets annotated with covariate statistics (optionally constrained to the subranges in the optional argument covered)
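As a rough illustration of the covariates argument, the list-of-lists structure might be assembled as below. All file paths and covariate names are hypothetical placeholders, the element names follow the list given for the covariates argument, and the final call is left commented out because it needs the package and real input files.

```r
# All file paths and names below are hypothetical placeholders.
covariates <- list(
  # numeric covariate: mean track value per target
  list(track = "replication_timing.bw", type = "numeric",
       name = "reptime", na.rm = TRUE),
  # sequence covariate: fraction of bases matching the signature
  list(track = "genome_sequence.rds", type = "sequence",
       name = "gc_content", signature = list(GC = c("G", "C")),
       grep = FALSE),
  # interval covariate: fraction of bases overlapping the feature
  list(track = "enhancers.bed", type = "interval",
       name = "enhancer", pad = 0)
)

# Basic structural check before passing the list on
valid_types <- c("numeric", "sequence", "interval")
types <- vapply(covariates, function(cv) cv$type, character(1))
stopifnot(all(types %in% valid_types))

# annotated <- annotate.targets(targets, covariates = covariates)
```

Checking the type field up front is a cheap way to catch a misspelled covariate type before a long annotation run starts.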


Marcin Imielinski

mskilab/fish.hook documentation built on Feb. 20, 2018, 4:23 p.m.