topWindowStats: stats for the top windows in each region

View source: R/topWindow.R

topWindowStats    R Documentation

stats for the top windows in each region

Description

Given window results and normalized counts, combine significant overlapping windows into regions and, for each region, pick two candidate windows:

  1. the window with the highest log2FoldChange, and

  2. the window with the highest normalized mean in the treatment samples (see parameter treatmentCols)

Return a data.frame with region information and stats, and for the selected windows, the following information:

  • unique_id of the window

  • start and end co-ordinates

  • log2FoldChange

  • normalized mean expression in treatment and control samples and

  • individual normalized expression in replicates
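The per-region selection described above can be illustrated on toy data. This is only a minimal base R sketch, not the package's implementation: the objects toyWindows and toyCounts, the region column and the sample names are all hypothetical, and the windows are assumed to be already filtered for significance and assigned to regions.

```r
# Toy data: five significant windows already assigned to two regions.
# All object and column names below are hypothetical.
toyWindows <- data.frame(
  unique_id      = paste0("w", 1:5),
  region         = c(1, 1, 1, 2, 2),
  log2FoldChange = c(1.2, 2.5, 1.8, 3.1, 1.1),
  stringsAsFactors = FALSE
)
toyCounts <- matrix(
  c(10, 12, 3, 4,
    20, 22, 5, 6,
    40, 38, 9, 8,
    15, 17, 2, 3,
    30, 28, 7, 9),
  ncol = 4, byrow = TRUE,
  dimnames = list(toyWindows$unique_id, c("IP1", "IP2", "SMI1", "SMI2"))
)
treatmentCols <- c("IP1", "IP2")

# Mean normalized expression over the treatment columns, per window
toyWindows$treatmentMean <-
  rowMeans(toyCounts[toyWindows$unique_id, treatmentCols])

# For each region, pick the window with the highest log2FoldChange and
# the window with the highest treatment mean
byRegion <- split(toyWindows, toyWindows$region)
picks <- do.call(rbind, lapply(byRegion, function(reg) {
  data.frame(
    region       = reg$region[1],
    log2FCWindow = reg$unique_id[which.max(reg$log2FoldChange)],
    meanWindow   = reg$unique_id[which.max(reg$treatmentMean)],
    stringsAsFactors = FALSE
  )
}))
print(picks)
```

In this toy data the two criteria select different windows within each region, which is why both candidates are reported.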

Usage

topWindowStats(
  windowRes,
  padjCol = "padj",
  padjThresh = 0.05,
  log2FoldChangeCol = "log2FoldChange",
  log2FoldChangeThresh = 1,
  start0based = TRUE,
  normalizedCounts,
  treatmentCols,
  treatmentName = "treatment",
  controlName = "control",
  op = "max"
)

Arguments

windowRes

data.frame, output from resultsDEWSeq

padjCol

character, name of the adjusted p-value column (default: padj)

padjThresh

numeric, threshold for the adjusted p-value (default: 0.05)

log2FoldChangeCol

character, name of the log2 fold change column (default: log2FoldChange)

log2FoldChangeThresh

numeric, threshold for the log2 fold change value (default: 1)

start0based

logical, TRUE (default) or FALSE. If TRUE, the start positions in windowRes are considered to be 0-based

normalizedCounts

data.frame or matrix, normalized read counts per window. rownames(normalizedCounts) must match the unique_id column of windowRes. See counts, vst or rlog

treatmentCols

character vector, column names in normalizedCounts for treatment/case samples. The remaining columns in the data.frame will be considered control samples

treatmentName

character, treatment name, see Details (default: treatment)

controlName

character, control name, see Details (default: control)

op

character, one of max (default) or min. max returns the windows with the maximum log2FoldChange and the maximum mean normalized expression in the treatmentCols columns; min returns the windows with the minimum log2FoldChange and the minimum mean normalized expression
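How normalizedCounts, treatmentCols and op relate to each other can be sketched in base R. The snippet below is an illustration under assumptions, not the function's internal code: the count matrix, the window ids and the sample names are made up, and the max/min switch only mimics the behaviour described for op.

```r
# Hypothetical normalized counts: rows are windows, columns are samples
normalizedCounts <- matrix(
  c(5, 7, 1, 2,
    9, 8, 3, 3),
  ncol = 4, byrow = TRUE,
  dimnames = list(c("w1", "w2"), c("IP1", "IP2", "SMI1", "SMI2"))
)
windowIds     <- c("w1", "w2")   # stands in for windowRes$unique_id
treatmentCols <- c("IP1", "IP2")

# Every window id should have a matching row in the counts
stopifnot(all(windowIds %in% rownames(normalizedCounts)))

# All treatment columns must exist; the remaining columns are controls
stopifnot(all(treatmentCols %in% colnames(normalizedCounts)))
controlCols <- setdiff(colnames(normalizedCounts), treatmentCols)

treatmentMean <- rowMeans(normalizedCounts[, treatmentCols, drop = FALSE])
controlMean   <- rowMeans(normalizedCounts[, controlCols, drop = FALSE])

# op = "max" or "min" mimicked by choosing the selector function
op   <- "max"
pick <- switch(op, max = which.max, min = which.min)
selected <- names(treatmentMean)[pick(treatmentMean)]
print(selected)
```

Running checks like these before calling topWindowStats can catch mismatched window ids or misspelled sample names early.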

Details

The output data.frame of this function has the following columns:

  • chromosome: chromosome name

  • gene_id: gene id

  • gene_name: gene name

  • gene_region: gene region

  • gene_type: gene type annotation

  • regionStartId: unique_id of the leftmost window, where an enriched region begins

  • region_begin: start position of the enriched region

  • region_end: end position of the enriched region

  • region_length: length of the enriched region

  • strand: strand info

  • Nr_of_region: number of the current region

  • Total_nr_of_region: total number of regions

  • log2FoldChange_min: min. log 2 fold change in the region

  • log2FoldChange_mean: average log 2 fold change in the region

  • log2FoldChange_max: max. log 2 fold change in the region

  • unique_id.log2FCWindow: unique_id of the window with largest log2FoldChange

  • begin.log2FCWindow: start position of the window with largest log2FoldChange

  • end.log2FCWindow: end of the window with largest log2FoldChange

  • log2FoldChange.log2FCWindow: log2FoldChange of the window with largest log2FoldChange

  • treatmentName.mean.log2FCWindow: mean of the normalized expression of the treatment samples for log2FCWindow, names in treatmentCols are used to calculate mean and treatmentName is from the parameter treatmentName

  • controlName.mean.log2FCWindow: mean of the normalized expression of the control samples for log2FCWindow, colnames(normalizedCounts) not found in treatmentCols are used to calculate mean and controlName is from the parameter controlName

  • the next columns will be normalized expression values of the log2FCWindow from individual treatment and control samples.

  • unique_id.meanWindow: unique_id of the window with largest mean in all treatment samples from treatmentCols

  • begin.meanWindow: start position of the mean window

  • end.meanWindow: end position of the mean window

  • log2FoldChange.meanWindow: log2FoldChange of the mean window

  • treatmentName.mean.meanWindow: mean of the normalized expression of the treatment samples for meanWindow, names in treatmentCols are used to calculate mean and treatmentName is from the parameter treatmentName

  • controlName.mean.meanWindow: mean of the normalized expression of the control samples for meanWindow, colnames(normalizedCounts) not found in treatmentCols are used to calculate mean and controlName is from the parameter controlName

  • the next columns will be normalized expression values of the meanWindow from individual treatment and control samples
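The treatmentName and controlName placeholders in the column names above are filled in from the corresponding parameters. A minimal sketch of the resulting names, assuming the name is simply pasted in front of the documented suffixes (check colnames() on the actual output to confirm):

```r
# Values as used in the Examples section; the pasting rule is an assumption
treatmentName <- "SLBP"
controlName   <- "SMI"
suffixes      <- c("log2FCWindow", "meanWindow")

# Expected names of the mean-expression columns for both selected windows
meanCols <- c(
  paste(treatmentName, "mean", suffixes, sep = "."),
  paste(controlName, "mean", suffixes, sep = ".")
)
print(meanCols)
```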

Value

data.frame

Examples


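library(DEWSeq)
# example window results and variance-stabilized counts from the package
data(slbpWindows)
data(slbpVst)
# IP1 and IP2 are the treatment samples; all other columns are controls
slbpList <- topWindowStats(slbpWindows, padjCol = 'pSlidingWindows.adj',
                           normalizedCounts = slbpVst,
                           treatmentCols = c('IP1', 'IP2'),
                           treatmentName = 'SLBP', controlName = 'SMI')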


EMBL-Hentze-group/DEWSeq documentation built on Oct. 17, 2023, 10:41 p.m.