rrscale: Re-scale a data matrix


View source: R/rrscale.R

Description

This transformation consists of three steps: (1) Gaussianize the data, (2) z-score transform the data (the Z-step), and (3) remove extreme outliers from the data (the O-step). Applying the steps in this order focuses further analyses on consequential variance in the data, rather than on variation that results from a feature's measurement scale or from outliers.
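The three steps can be sketched in base R as follows. This is a minimal illustration and not the rrscale implementation: it assumes a fixed log transformation in place of the optimized transformation family, column-wise median/MAD z-scores, and marking outliers as NA. The names G, Z and z_cut are hypothetical.

```r
# Illustrative sketch only; rrscale() itself optimizes the transformation.
set.seed(1)
Y <- matrix(rlnorm(200), nrow = 20, ncol = 10)  # positive, skewed data
Y[1, 1] <- 1e6                                  # plant one extreme outlier

# (1) Gaussianize: a fixed log transform stands in for the optimized family.
G <- log(Y)

# (2) Z-score each column robustly (assumption: median and MAD per column).
Z <- apply(G, 2, function(col) (col - median(col)) / mad(col))

# (3) Remove extreme outliers (assumption: removal means setting to NA).
z_cut <- 4
Z[abs(Z) > z_cut] <- NA

is.na(Z[1, 1])  # the planted outlier has been removed
```

Because the z-scores use the median and MAD, the planted outlier barely affects the centre and scale estimates, so it stands out clearly in step (3).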

Usage

rrscale(
  Y,
  trans_list = list(box_cox_negative = box_cox_negative, asinh = asinh),
  lims_list = list(box_cox_negative = c(-100, 100), asinh = list(0, 100)),
  opt_control = NULL,
  opt_method = "DEoptim",
  z = 4,
  q = 0.001,
  verbose = FALSE,
  log_dir = ".rrscale/",
  zeros = FALSE,
  opts = FALSE,
  seed = NULL
)

Arguments

Y

Data matrix, data.frame, or list of vectors, to be transformed.

trans_list

List of transformations to be considered. See the function list_transformations. Each element of the list should itself be a list containing the transformation function as its first element and the derivative of the transformation function as its second element. The first argument of each function should be the data and the second the transformation parameter.

lims_list

List of optimization limits for each transformation in trans_list. This should be a list of the same length as trans_list. Each element is a two-element vector giving the lower and upper optimization limits for the parameter of the corresponding transformation family.
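As a sketch of the structure described for trans_list and lims_list, the following builds a custom entry for a hypothetical power-transformation family. The names power_trans, my_trans_list and my_lims_list are made up for illustration, the limits are arbitrary, and the derivative is assumed to be taken with respect to the data.

```r
# Hypothetical family T(y; lambda) = y^lambda, for illustration only.
power_trans <- list(
  function(Y, lambda) Y^lambda,                # transformation
  function(Y, lambda) lambda * Y^(lambda - 1)  # derivative (assumed w.r.t. Y)
)

# One named entry per family; limits are a two-element vector (lower, upper).
my_trans_list <- list(power = power_trans)
my_lims_list  <- list(power = c(0.1, 2))

# Sanity check: compare the supplied derivative with a central difference.
y <- c(0.5, 1, 2)
lambda <- 0.5
h <- 1e-6
numeric_deriv <- (power_trans[[1]](y + h, lambda) -
                  power_trans[[1]](y - h, lambda)) / (2 * h)
deriv_ok <- isTRUE(all.equal(numeric_deriv, power_trans[[2]](y, lambda),
                             tolerance = 1e-6))
```

Lists shaped like these would then be passed as rrscale(Y, trans_list = my_trans_list, lims_list = my_lims_list). Checking the derivative numerically before handing it to the optimizer is a cheap way to catch sign or exponent mistakes.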

opt_control

Optional control parameters for the optimization, passed as the control argument of DEoptim. See the DEoptim package for details.

opt_method

Which optimization method to use. Defaults to "DEoptim". The other choice is nloptr.

z

The O-step cutoff value. Points are removed if their robust z-score is above z in magnitude.

q

The Z-step winsorizing quantile cutoff. The quantile at which to winsorize the data when calculating the robust z-scores.
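One way to picture how q and z interact is the base R sketch below. It is an assumption about the general technique, not the package's code: values are clamped to the q and 1 - q quantiles before the centre and scale are estimated, and the original values are then scored against those estimates. The helper winsorized_z is hypothetical, and q = 0.05 is used only so the effect is visible on a small sample.

```r
# Hypothetical helper: z-scores based on a winsorized mean and sd.
winsorized_z <- function(x, q = 0.001) {
  lims <- unname(quantile(x, probs = c(q, 1 - q), na.rm = TRUE))
  xw <- pmin(pmax(x, lims[1]), lims[2])   # clamp both tails to the quantiles
  (x - mean(xw, na.rm = TRUE)) / sd(xw, na.rm = TRUE)
}

set.seed(1)
x <- c(rnorm(99), 50)            # 99 well-behaved points plus one outlier
zs <- winsorized_z(x, q = 0.05)  # larger q than the default, for illustration

z_cut <- 4
abs(zs[100]) > z_cut             # the outlier exceeds the O-step style cutoff
```

Winsorizing keeps the outlier from inflating the scale estimate, which would otherwise shrink its own z-score and let it slip under the cutoff.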

verbose

Boolean. If TRUE, optimization output is saved in log_dir.

log_dir

Directory for verbose output. Defaults to ".rrscale/".

zeros

How to handle zeros in the data set. If set to FALSE, the algorithm fails when it encounters a zero. If set to a number or NA, zeros are replaced by that value.
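The replacement behaviour described for zeros can be mimicked in base R as shown here. This is only a sketch of the idea; replace_zeros is a hypothetical helper and is not part of rrscale.

```r
# Hypothetical helper: replace exact zeros by a given value (a number or NA).
replace_zeros <- function(Y, value) {
  Y[!is.na(Y) & Y == 0] <- value
  Y
}

Y <- matrix(c(0, 1.2, 3.4, 0, 5.6, 7.8), nrow = 2)

Y_na    <- replace_zeros(Y, NA)    # zeros become missing values
Y_small <- replace_zeros(Y, 0.01)  # zeros become a small positive number
```

Replacing zeros with NA drops them from the fit, while replacing them with a small positive number keeps them in the data at the cost of choosing that number by hand.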

opts

Boolean determining if optimization output is returned. Defaults to FALSE.

seed

Random seed, set before any other analyses are run. Defaults to NULL.

Value

A list of output. In the example below, the transformed data are extracted from its RR element.

Examples

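# Simulate a 10 x 10 rank-one matrix of positive, log-normal values
Y <- rlnorm(10) %*% t(rlnorm(10))
# Apply the transformation with default settings
rr.out <- rrscale(Y)
# Extract the RR element of the output
Yt <- rr.out$RR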

rrscale documentation built on July 2, 2020, 2:15 a.m.