annotate.targets: title

View source: R/fishhook.R

Description

Takes as input a set of GRanges targets, an optional set of "covered" intervals, and an arbitrary list of covariates, which can be R objects (GRanges, ffTrack, Rle) or file paths to .rds, .bw, or .bed files. Returns a GRanges of annotated target intervals with covariates computed for each interval. These target intervals can be further annotated with mutation counts and plugged into a generalized linear regression (or other) model downstream.

There are three types of covariates: numeric, sequence, and interval. They are computed as follows:

numeric covariates: the mean value
sequence covariates: fraction of bases satisfying $signature
interval covariates: fraction of bases overlapping feature
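The three calculations above can be illustrated on a single toy interval using base R only. This is a minimal sketch of the arithmetic, not the package's implementation; the coordinates, scores, sequence, and feature intervals are all made up for illustration.

```r
# Toy target interval covering positions 101..110 (10 bases).
pos <- 101:110

# Numeric covariate: mean of a per-base score over the target.
score <- c(rep(1, 5), rep(3, 5))           # hypothetical per-base values
numeric_cov <- mean(score, na.rm = TRUE)   # 2

# Sequence covariate: fraction of bases matching a signature (here G or C).
bases <- strsplit("GATTACAGCG", "")[[1]]   # hypothetical sequence of the target
sequence_cov <- mean(bases %in% c("G", "C"))   # 5 of 10 bases -> 0.5

# Interval covariate: fraction of target bases overlapped by any feature.
features <- data.frame(start = c(104, 110), end = c(106, 115))
in_feature <- vapply(
  pos,
  function(p) any(p >= features$start & p <= features$end),
  logical(1)
)
interval_cov <- mean(in_feature)           # 4 of 10 bases -> 0.4
```

In the real function these quantities are computed per target from GRanges, ffTrack, or Rle tracks rather than from plain vectors.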

Usage

annotate.targets(targets, covered = NULL, events = NULL, mc.cores = 1,
  na.rm = TRUE, pad = 0, verbose = TRUE, max.slice = 1000,
  ff.chunk = 1e+06, max.chunk = 1e+11, out.path = NULL,
  covariates = list(), maxpatientpergene = Inf, ptidcol = NULL,
  weightEvents = FALSE, ...)
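As a rough illustration of how a call might be assembled, the sketch below builds a covariates list with one numeric and one interval covariate and wraps the call in a helper. The file names, the covariate names, and the helper annotate_exome are hypothetical; the list elements (track, type, field) are those named under the covariates argument, and their exact behavior should be checked against the Cov_Arr documentation.

```r
# Hypothetical covariate definitions; replace the paths with real files.
covs <- list(
  replication_timing = list(
    track = "replication_timing.rds",  # hypothetical rds holding a GRanges with a score column
    type  = "numeric",
    field = "score"
  ),
  heterochromatin = list(
    track = "heterochromatin.bed",     # hypothetical bed file of feature intervals
    type  = "interval"
  )
)

# Hypothetical wrapper: only defines the call, it does not run it.
# Calling it requires the package providing annotate.targets and real input files.
annotate_exome <- function(targets_path, events_path = NULL) {
  annotate.targets(
    targets    = targets_path,
    events     = events_path,
    covariates = covs,
    mc.cores   = 1,
    verbose    = FALSE
  )
}
```

Keeping the covariate definitions in a named list makes it easy to add or drop tracks without touching the call itself.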

Arguments

targets

path to bed or rds containing genomic target regions with optional target name

covered

optional path to bed or rds containing a GRanges object of "covered" genomic regions (default = NULL)

events

optional path to bed or rds containing ranges corresponding to events (e.g. mutations) (default = NULL)

mc.cores

integer Number of cores to use for parallel computation (default = 1)

na.rm

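boolean If TRUE, NA values are removed when computing covariate statistics; used as the default for each covariate's na.rm element (default = TRUE)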

pad

integer Padding applied to intervals; can also be set per covariate through its pad element (default = 0)

verbose

boolean verbose flag (default = TRUE)

max.slice

integer Max slice of intervals to evaluate with gr.val (default = 1e3)

ff.chunk

integer Max chunk to evaluate with fftab (default = 1e6)

max.chunk

integer gr.findoverlaps parameter (default = 1e11)

out.path

path to save the output to (default = NULL)

covariates

list of lists where each internal list represents a covariate. The internal list can have the elements track, type, signature, name, pad, na.rm, field, and grep; na.rm defaults to the na.rm argument. See the Cov_Arr class for a description of what each of these elements does. Note that track is equivalent to the 'Covariate' parameter in Cov_Arr (default = list())

maxpatientpergene

Sets the maximum number of events a patient can contribute per target (default = Inf)

ptidcol

string Column where patient ID is stored (default = NULL)

...

covariates whose output names will be their argument names. Each consists of a list with a $track field corresponding to a GRanges, RleList, or ffTrack object (or a path to an rds containing that object) and a $type field, which can have one of three values: "numeric", "sequence", "interval".

Numeric tracks must have a $score field (if they are GRanges) and can have a $na.rm logical field describing how to treat NA values (set to the na.rm argument by default).

Sequence covariates must be ffTrack objects (or paths to ffTrack rds) and require an additional variable $signatures, which is used as input to fftab. They can have an optional logical argument $grep to specify inexact matches (see fftab).

fftab signature: signatures is a named list that specifies what is to be tallied. Each signature (i.e. list element) consists of an arbitrary-length character vector specifying strings to match, or a length-1 character vector to grepl (if grep = TRUE), or a length-1 or length-2 numeric vector specifying an exact value or interval to match (for numeric data). Every list element of signature becomes a metadata column in the output GRanges specifying how many positions in the given interval match the given query.

Interval covariates must be GRanges (or paths to GRanges rds) or paths to bed files.

weightEvents

boolean If TRUE, will weight events by their overlap with targets, e.g. if 10% of an event overlaps a target region, that target region will be assigned a score of 0.1 for that event. If FALSE, any overlap will be given a weight of 1 (default = FALSE)
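The difference between weighted and unweighted event counting can be sketched in base R. This assumes the weight is the fraction of each event's width that falls inside the target, which is one reading of the description above and not confirmed against the package source; all coordinates are made up.

```r
# One toy target and three toy events (inclusive coordinates).
target <- c(start = 100, end = 199)
events <- data.frame(start = c(150, 199, 300), end = c(159, 208, 309))

# Width of the overlap between each event and the target (0 if disjoint).
ov_start <- pmax(events$start, target[["start"]])
ov_end   <- pmin(events$end, target[["end"]])
ov_width <- pmax(0, ov_end - ov_start + 1)

event_width <- events$end - events$start + 1

# weightEvents = FALSE: any overlap counts as 1.
unweighted <- sum(ov_width > 0)

# weightEvents = TRUE (assumed semantics): each event contributes its overlap fraction.
weights  <- ov_width / event_width   # 1.0, 0.1, 0.0
weighted <- sum(weights)
```

Here the second event has only 1 of its 10 bases inside the target, so it contributes 0.1 in the weighted count and 1 in the unweighted count.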

Value

GRanges of input targets annotated with covariate statistics (optionally constrained to the subranges given in the covered argument)
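Once targets carry event counts and covariates, they can be passed to a regression model downstream. The sketch below fits a generic Poisson GLM with base R on simulated data; it is not necessarily the model the package itself uses, and the column names (width, gc, reptiming, count) and simulated values are assumptions standing in for the metadata columns of an annotated GRanges.

```r
set.seed(1)

# Simulated stand-in for annotated targets converted to a data frame.
n <- 200
toy <- data.frame(
  width     = sample(500:5000, n, replace = TRUE),  # covered bases per target
  gc        = runif(n, 0.3, 0.7),                   # sequence covariate (fraction)
  reptiming = rnorm(n)                              # numeric covariate (mean value)
)

# Simulate event counts whose rate per base depends on the covariates.
rate <- exp(-7 + 1.0 * toy$gc + 0.3 * toy$reptiming)
toy$count <- rpois(n, lambda = toy$width * rate)

# Poisson regression of counts on covariates, with target width as an offset.
fit <- glm(
  count ~ gc + reptiming + offset(log(width)),
  family = poisson(),
  data   = toy
)
coef(fit)
```

The offset term makes the model describe events per covered base, so that long targets are not flagged simply for being long.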

Author(s)

Marcin Imielinski


mskilab/fish.hook documentation built on Feb. 20, 2018, 4:23 p.m.