jammacalc: Calculate MA-plot data
In jmw86069/jamma: MA-plots for omics data

jammacalc

R Documentation

Calculate MA-plot data

Description

Calculate MA-plot data

Usage

jammacalc(
  x,
  na.rm = TRUE,
  controlSamples = NULL,
  centerGroups = NULL,
  controlFloor = NA,
  naControlAction = c("row", "floor", "min", "na"),
  naControlFloor = 0,
  groupedX = TRUE,
  useMedian = TRUE,
  useMean = NULL,
  whichSamples = NULL,
  noise_floor = -Inf,
  noise_floor_value = noise_floor,
  naValue = NA,
  mad_row_min = 0,
  grouped_mad = TRUE,
  centerFunc = centerGeneData,
  useRank = FALSE,
  returnType = c("ma_list", "tidy"),
  verbose = FALSE,
  ...
)

Arguments

`x`	`numeric` matrix typically containing log-normal measurements, with measurement rows, and sample columns.
`na.rm`	`logical` indicating whether to ignore NA values during `numeric` summary functions.
`controlSamples`	`character` vector containing values in `colnames(x)` to define control samples used during centering. These values are passed to `centerGeneData()`.
`centerGroups`	`character` vector with length equal to `ncol(x)` which defines the group for each column in `x`. Data will be centered within each group.
`groupedX`	`logical` indicating how to calculate the x-axis value when `centerGroups` contains multiple groups. When `groupedX=TRUE`, the mean of the group medians is used, which has the effect of representing each group equally. When `groupedX=FALSE`, the median across all columns is used, which can have the effect of favoring sample groups with a larger number of columns.
`useMedian`	`logical` indicating whether to use the median values when calculating the x-axis and during data centering. Compared to the mean, the median naturally reduces the effect of outlier points on the resulting MA-plots. When `useMedian=FALSE`, the mean value is used.
`useMean`	(deprecated) `logical` indicating whether to use the mean instead of the median value. This argument is being removed in order to improve consistency with other Jam package functions.
`whichSamples`	`character` vector containing `colnames(x)`, or integer vector referencing column numbers in `x`. This argument specifies which columns to return, but does not change the columns used to define the group centering values. For example, the group medians are calculated using all the data, but only the samples in `whichSamples` are centered to produce MA-plot data.
`noise_floor`	`numeric` value indicating the minimum numeric value allowed in the input matrix `x`. When `NULL` or `-Inf` no noise floor is applied. It is common to set `noise_floor=0` to limit MA-plot data to use values zero and above.
`noise_floor_value`	single `numeric` value used to replace `numeric` values at or below `noise_floor` when `noise_floor` is not NULL. By default, `noise_floor_value=noise_floor`, which means values at or below the noise floor are set to the floor. Another useful option is `noise_floor_value=NA`, which has the effect of removing the point from the MA-plot altogether. This option is recommended for sparse data matrices where the presence of values at or below zero is indicative of missing data (zero-inflated data) and does not necessarily reflect an actual value of zero.
`naValue`	single `numeric` value used to replace any `NA` values in the input matrix `x`. This argument can be useful to replace `NA` values with something like zero.
`mad_row_min`	`numeric` value defining the minimum group value, corresponding to the x-axis position on the MA-plot, required for a row to be included in the MAD calculation. This threshold is useful to filter outlier data below a noise threshold, so that the MAD calculation will include only the data above that value. For example, with count data, it is useful to filter out counts below roughly 8, where Poisson noise is a more dominant component than real count data. Remember that count data should already be log2-transformed, so the threshold should be transformed identically, for example using `log2(1 + 8)` to set a minimum count threshold of 8.
`grouped_mad`	`logical` indicating whether the MAD value should be calculated per group when `centerGroups` is supplied, from which the MAD factor values are derived. When `TRUE` it has the effect of highlighting outliers within each group using the variability in that group. When `FALSE` the overall MAD is calculated, and a particularly high variability group may have all its group members labeled with a high MAD factor.
`centerFunc`	`function` used for centering data, by default one of the functions `centerGeneData()` or `centerGeneData_v1()`. This argument will be removed in the near future and is mainly intended to allow testing the two centering functions. The following arguments are passed to this function:
	x: the input `numeric` data matrix
	na.rm: `logical` whether to ignore NA values. Always use `na.rm=TRUE`.
	controlSamples: `character` optional subset of `colnames(x)` to use as reference controls during centering
	centerGroups: `character` vector of groups for `colnames(x)`
	controlFloor: `numeric` optional minimum allowed value for control summary prior to centering
	naControlAction: `character` string for how to handle entirely NA control groups during centering
	naControlFloor: `numeric` used when `naControlAction="floor"` and all control values are `NA`. One `numeric` value is inserted into the control group.
	useMedian: `logical` whether to use median (TRUE) or mean (FALSE)
	returnGroups: `logical` whether to return summary of group assignment in attribute `"center_df"`
	returnGroupedValues: `logical` whether to return group summary values in attribute `"x_group"`
	...: other arguments are passed along via `...`.
`returnType`	`character` string indicating the format of data to return: `"ma_list"` returns a list of two-column numeric MA-plot matrices with colnames `c("x","y")`; `"tidy"` returns a tall `data.frame` suitable for use in ggplot2.
`verbose`	`logical` indicating whether to print verbose output.
`...`	additional arguments are ignored.
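
The behavior described for `noise_floor`, `noise_floor_value`, and `mad_row_min` can be sketched in base R. This is only an illustration of the argument descriptions above, not the jamma implementation; the helper `apply_noise_floor()` and the example matrix are hypothetical.

```r
# Hypothetical helper (not part of jamma): replace values at or below the
# noise floor, following the description of noise_floor/noise_floor_value.
apply_noise_floor <- function(x, noise_floor = -Inf,
                              noise_floor_value = noise_floor) {
  # NULL or -Inf means no noise floor is applied
  if (is.null(noise_floor) || is.infinite(noise_floor)) {
    return(x)
  }
  at_or_below <- !is.na(x) & x <= noise_floor
  x[at_or_below] <- noise_floor_value
  x
}

# Small example matrix: rows are measurements, columns are samples
x <- matrix(c(-1, 0, 2.5, 4, NA, 7), nrow = 3,
  dimnames = list(paste0("row", 1:3), c("s1", "s2")))

# Values at or below zero are set to the floor itself
x_floor <- apply_noise_floor(x, noise_floor = 0)

# Values at or below zero are set to NA, which drops those points
x_na <- apply_noise_floor(x, noise_floor = 0, noise_floor_value = NA)

# A count threshold of 8 expressed on the log2(1 + count) scale,
# as suggested for mad_row_min
mad_row_min <- log2(1 + 8)
```

Setting the replacement to `NA` rather than the floor keeps zero-inflated values from piling up as a band of points at the floor.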

Details

This function takes a numeric matrix as input, and calculates data sufficient to produce MA-plots. The default output is a list of two-column numeric matrices with "x" and "y" coordinates, representing the group median and difference from median, respectively.
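
As a rough illustration of the default output, the sketch below computes "x" as the row median and "y" as each sample's difference from that median, using base R only. The helper `ma_sketch()` and its `use_median` argument are hypothetical and do not reproduce the centering, grouping, or MAD logic in `jammacalc()`.

```r
# Hypothetical helper (not the jamma implementation): one two-column
# matrix per sample, with x = row center and y = difference from it.
ma_sketch <- function(x, use_median = TRUE) {
  summary_fun <- if (use_median) stats::median else mean
  # Row center across all samples, ignoring NA values
  row_center <- apply(x, 1, summary_fun, na.rm = TRUE)
  ma_list <- lapply(colnames(x), function(sample_name) {
    cbind(x = row_center, y = x[, sample_name] - row_center)
  })
  names(ma_list) <- colnames(x)
  ma_list
}

# Example data, assumed to be already log2-transformed
x <- matrix(c(1, 4, 2, 5, 6, 9), nrow = 2,
  dimnames = list(c("gene1", "gene2"), c("s1", "s2", "s3")))

ma <- ma_sketch(x)                          # median-based
ma_mean <- ma_sketch(x, use_median = FALSE) # mean-based
```

Each list element can be plotted directly, for example with `plot(ma$s3)`.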

The mean value can be used by setting useMedian=FALSE.

Samples can be grouped using the argument centerGroups. In this case the y-axis value will be "difference from group median."
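
A minimal base R sketch of group centering follows, assuming two hypothetical groups "A" and "B". It also contrasts the two x-axis summaries described for `groupedX`. The variable names are illustrative, and the code is not taken from jamma.

```r
# Example data: two groups of unequal size (values assumed log2-scale)
x <- matrix(c(1, 2, 3, 4, 10, 5, 12, 5, 14, 8), nrow = 2,
  dimnames = list(c("gene1", "gene2"), c("a1", "a2", "b1", "b2", "b3")))
center_groups <- c("A", "A", "B", "B", "B")

# Per-group row medians: one column per group
group_medians <- sapply(split(colnames(x), center_groups), function(cols) {
  apply(x[, cols, drop = FALSE], 1, median, na.rm = TRUE)
})

# y-axis: difference from the median of each sample's own group
y_centered <- x - group_medians[, center_groups]

# x-axis as described for groupedX=TRUE: mean of the group medians
x_grouped <- rowMeans(group_medians)

# x-axis as described for groupedX=FALSE: median across all columns,
# which leans toward the larger group "B"
x_ungrouped <- apply(x, 1, median, na.rm = TRUE)
```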

Control samples can be specified for centering using the argument controlSamples. In this case, the y-axis value will be "difference from control median".

Sample grouping and control samples can be combined, in which case the y-axis values will be "difference from the control median within the centering group."
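
The combined case can be sketched in base R as below. This assumes every centering group contains at least one control sample; the handling of missing or entirely `NA` controls (`naControlAction`) is not modeled. The sample and group names are hypothetical.

```r
# Example data with one control sample per group (values assumed log2-scale)
x <- matrix(c(1, 2, 3, 4, 10, 5, 12, 5, 14, 8), nrow = 2,
  dimnames = list(c("gene1", "gene2"), c("a1", "a2", "b1", "b2", "b3")))
center_groups <- c("A", "A", "B", "B", "B")
control_samples <- c("a1", "b1")

# Row median of the control samples within each centering group
control_medians <- sapply(split(colnames(x), center_groups), function(cols) {
  control_cols <- intersect(cols, control_samples)
  apply(x[, control_cols, drop = FALSE], 1, median, na.rm = TRUE)
})

# y-axis: difference from the control median within each sample's group
y_vs_control <- x - control_medians[, center_groups]
```

With this centering the control samples sit at zero on the y-axis when there is one control per group, and the remaining samples show their difference from that control.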
