regionFinder: Break up nucleotide level signal into candidate regions and...
In cshukla/oligoGames: Analyze data from massively parallel reporter assays


View source: R/DRfinder.R

This is an internal workhorse function for bumphunt. It takes the nucleotide-level signal, parses it into contiguous regions that pass the threshold (the candidate regions), and scores each candidate with a test statistic for the condition difference.
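The thresholding and grouping step can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `sketch_regions` is a hypothetical helper, positions are assumed to be sorted and to share a single `chr` label, `cutoff` is treated as a scalar, and a run is assumed to end whenever a nucleotide falls below the cutoff or the sign of the coefficient changes.

```r
# Hypothetical helper (not part of oligoGames): group nucleotides into
# candidate regions from per-nucleotide coefficients and positions.
sketch_regions <- function(x, pos, cutoff, maxGap, minNumRegion = 5) {
  # Direction of each nucleotide: 1 above cutoff, -1 below -cutoff, 0 otherwise
  direction <- ifelse(x > cutoff, 1, ifelse(x < -cutoff, -1, 0))
  # Neighbouring positions more than maxGap apart fall into different clusters
  cluster <- cumsum(c(TRUE, diff(pos) > maxGap))
  # A new run starts whenever the direction or the cluster changes
  run_id <- cumsum(c(TRUE, diff(direction) != 0 | diff(cluster) != 0))
  runs <- split(seq_along(x), run_id)
  # Keep runs that pass the cutoff and contain enough nucleotides
  passes <- vapply(runs, function(i) direction[i[1]] != 0, logical(1))
  long_enough <- lengths(runs) >= minNumRegion
  unname(runs[passes & long_enough])
}

# Made-up coefficients and positions for illustration
x   <- c(0.1, 2.0, 2.2, 2.1, 0.2, 2.3, 2.4, 2.5, -2.6, -2.2)
pos <- c(1, 2, 3, 4, 5, 60, 61, 62, 63, 64)
regs <- sketch_regions(x, pos, cutoff = 1, maxGap = 50, minNumRegion = 2)
```

With these inputs, `regs` holds three index vectors: nucleotides 2 to 4, 6 to 8, and 9 to 10. Raising `minNumRegion` to 3 drops the last one.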

regionFinder(x, chr, pos, cluster = NULL, ind = seq(along = x),
  order = TRUE, minNumRegion = 5, maxGap, cutoff = quantile(abs(x), 0.99),
  assumeSorted = FALSE, oligo.mat = oligo.mat, verbose = TRUE,
  design = design, workers = workers, logT = TRUE, naive = FALSE,
  beta = NULL)

`x`	a vector of condition coefficients (for the covariate of interest), one per nucleotide
`chr`	a character vector of labels for region-level characteristics, with length equal to the number of rows in `oligo.mat` (and in the same order). This can indicate the chromosome, gene, lncRNA, etc.
`pos`	a numeric vector of basepair positions for each nucleotide in `oligo.mat` (and in the same order).
`cluster`	a vector of cluster membership values for each nucleotide determined by the `clusterMaker` function in the `bumphunter` package
`ind`	a vector of indices of `x` which are non-NULL. Defaults to all indices of `x`.
`order`	logical that indicates whether or not to order the candidate regions by the test statistic magnitude (largest to smallest). Defaults to TRUE.
`minNumRegion`	positive integer that represents the minimum number of nucleotides to consider for a candidate region. Default value is 5.
`maxGap`	positive integer that indicates the maximum number of basepairs that can separate two nucleotides before they are divided into two separate candidate regions. The documented default is 50, but the usage signature sets no default, so pass a value explicitly.
`cutoff`	scalar value that represents the absolute value (or a vector of two numbers representing a lower and upper bound) for the cutoff of the single nucleotide condition coefficient that is used to discover candidate regions.
`assumeSorted`	logical that indicates whether the nucleotides are sorted in ascending order. Defaults to FALSE.
`oligo.mat`	a matrix of nucleotide-level counts with one row per nucleotide and one column per sample.
`verbose`	logical value that indicates whether additional progress messages within each iteration should be printed to stdout. The usage signature sets the default to TRUE.
`design`	a model matrix with one row per sample and one column per independent covariate.
`workers`	positive integer that represents the number of cores to use if parallelization of the smoothing step is desired.
`logT`	logical value that indicates whether to model the log2 transformed signal (plus a pseudocount of 1). Default is TRUE. Set to FALSE only if the transformation was applied before running this function, or if the distribution of raw values looks relatively symmetric.
`naive`	a logical value indicating whether to use the naive region-level statistic in step 2, which simply averages the step 1 statistic across the region, instead of the default, which calculates a new statistic that jointly considers all loci in the region. When TRUE, the standard deviation among replicates is also not considered in step 1.
`beta`	vector of locus-specific statistics from step 1 (only needed if `naive` is TRUE)
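As a rough illustration of the shapes these arguments take, the sketch below builds toy inputs in base R. All data are simulated and the sample names are made up. The intercept column in `design` is an assumption about how the model matrix is set up, and the `log2(count + 1)` line only mirrors the description of `logT`, not the function's internals.

```r
set.seed(1)
# Hypothetical counts: 6 nucleotides (rows) by 4 samples (columns)
oligo.mat <- matrix(rpois(24, lambda = 20), nrow = 6,
                    dimnames = list(NULL, paste0("s", 1:4)))

# One row per sample; here an intercept plus one condition covariate
condition <- factor(c("ctrl", "ctrl", "trt", "trt"))
design <- model.matrix(~ condition)

# The transformation described for logT = TRUE: log2 with a pseudocount of 1
log.mat <- log2(oligo.mat + 1)

# The default cutoff expression from the usage, applied to simulated coefficients
x <- rnorm(1000)
cutoff <- quantile(abs(x), 0.99)
n_pass <- sum(abs(x) > cutoff)   # nucleotides whose |coefficient| exceeds the cutoff
```

Because the default cutoff is the 99th percentile of `abs(x)`, roughly 1% of nucleotides exceed it; passing a smaller explicit `cutoff` yields more, and typically longer, candidate regions.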

a data.frame that contains the results of region detection. The data.frame contains one row for each candidate region, and 7 columns, in the following order:
1. chr = region-level labels such as chromosome, gene, or lncRNA
2. start = start basepair position of the region
3. end = end basepair position of the region
4. indexStart = the index of the region's starting nucleotide
5. indexEnd = the index of the region's ending nucleotide
6. length = the number of nucleotides contained in the region
7. stat = the test statistic for the condition difference
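To make the layout of the returned data.frame concrete, the sketch below assembles one by hand from hypothetical regions. It is not the function's own code: the inputs are made up, and `stat` is computed as the plain mean of `x` over each region, in the spirit of the naive statistic described for `naive = TRUE`. The default statistic is computed differently.

```r
# Hypothetical inputs for illustration only
x   <- c(0.1, 2.0, 2.2, 2.1, 0.2, 2.3, 2.4, 2.5, -3.0, -3.2)
pos <- c(1, 2, 3, 4, 5, 60, 61, 62, 63, 64)
chr <- rep("geneA", length(x))
regions <- list(2:4, 6:8, 9:10)   # index vectors, one per candidate region

res <- data.frame(
  chr        = vapply(regions, function(i) chr[i[1]], character(1)),
  start      = vapply(regions, function(i) min(pos[i]), numeric(1)),
  end        = vapply(regions, function(i) max(pos[i]), numeric(1)),
  indexStart = vapply(regions, function(i) min(i), integer(1)),
  indexEnd   = vapply(regions, function(i) max(i), integer(1)),
  length     = lengths(regions),
  # Naive-style statistic: mean coefficient across the region
  stat       = vapply(regions, function(i) mean(x[i]), numeric(1)),
  stringsAsFactors = FALSE
)

# Mirrors order = TRUE: sort by test statistic magnitude, largest first
res <- res[order(abs(res$stat), decreasing = TRUE), ]
```

Here the region covering nucleotides 9 to 10 sorts first because its mean coefficient has the largest magnitude, even though it is negative.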

cshukla/oligoGames documentation built on May 27, 2019, 8:44 a.m.