df_estimate: Estimation of degrees of freedom (df) for an arbitrary...

View source: R/randRot.R

Description

This function is deprecated and will be defunct in the next release. It estimates the local degrees of freedom (df) of mapped data for an arbitrary mapping function. The estimation is done for a set of selected features.

Usage

df_estimate(
  data,
  features = sample(nrow(data), 10),
  mapping,
  ...,
  delta = sqrt(.Machine$double.eps)
)

Arguments

data

A numerical data matrix.

features

Features for which the df should be estimated (default sample(nrow(data),10)).

mapping

A mapping function that takes a matrix with features x samples dimensions as its first argument and returns a matrix of mapped data with the same dimensions. Any further arguments can be passed to mapping through ....

...

Additional arguments passed to mapping.

delta

A numeric delta for the finite differences (default sqrt(.Machine$double.eps)).

Details

The df are estimated as the rank of the local Jacobian matrix. It is thus the rank of the local linear approximation of the mapping function, where linearisation is performed around data. The Jacobian matrix J for a certain feature j is calculated with finite differences:

data2 <- data

data2[j,i] <- data2[j,i] + delta

J[,i] <- (mapping(data2, ...)[j,] - mapping(data, ...)[j,])/delta

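In the current implementation, the rank of J is calculated as the sum of the singular values of J. This sum equals the rank when every nonzero singular value is 1, as is the case for an orthogonal projection. For each feature, a singular value decomposition (SVD) of the ncol(data) x ncol(data) matrix J is therefore computed.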
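The procedure described above can be sketched for a single feature as follows. This is a minimal illustration under stated assumptions, not the package's implementation: df_sketch and centre_rows are hypothetical names, and the sketch assumes that column i of J is taken from row j of the mapped output, which is what the stated ncol(data) x ncol(data) size of J implies.

```r
# Hypothetical helper (not part of randRotation): df estimate for one feature j.
df_sketch <- function(data, j, mapping, ..., delta = sqrt(.Machine$double.eps)) {
  mapped <- mapping(data, ...)           # mapping evaluated at the unperturbed data
  n <- ncol(data)
  J <- matrix(0, n, n)
  for (i in seq_len(n)) {
    data2 <- data
    data2[j, i] <- data2[j, i] + delta   # perturb sample i of feature j
    # Assumption: only row j of the mapped output enters the Jacobian of feature j.
    J[, i] <- (mapping(data2, ...)[j, ] - mapped[j, ]) / delta
  }
  sum(svd(J)$d)                          # sum of singular values as rank estimate
}

# Illustrative mapping (an assumption): centre each feature across samples.
centre_rows <- function(Y) Y - rowMeans(Y)

set.seed(1)
toy <- matrix(rnorm(5 * 8), nrow = 5)    # 5 features x 8 samples
df_toy <- df_sketch(toy, j = 1, mapping = centre_rows)
df_toy
```

Centring removes one df per feature, so with 8 samples the estimate should be close to 7, up to finite-difference error.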

This function should be considered experimental due to the common numerical issues associated with finite differences and numerical calculation of matrix ranks. So always check results for plausibility.
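One possible plausibility check, sketched here under assumptions, is to apply the same rank calculation to a linear mapping whose df are known analytically. The example below uses the residual projection of a small, made-up batch design (the design, the variable names and the tolerance tol are illustrative choices, not part of the package). It compares the sum of singular values with a thresholded count of singular values and with the exact value n - rank(X).

```r
n <- 12
batch <- factor(rep(1:3, each = 4))      # made-up design: 3 batches of 4 samples
X <- model.matrix(~ batch)               # intercept + 2 batch indicator columns

# Residual projection: a feature (row vector) y is mapped to y %*% P.
H <- X %*% solve(crossprod(X), t(X))     # hat matrix of the design
P <- diag(n) - H

d <- svd(P)$d
tol <- 1e-8                              # illustrative tolerance (an assumption)

df_sum   <- sum(d)                       # sum of singular values
df_count <- sum(d > tol)                 # number of singular values above tol
df_exact <- n - qr(X)$rank               # analytic df of the residuals

c(sum = df_sum, count = df_count, exact = df_exact)
```

For a projection like P the three numbers should agree. For mappings that are not projections, the sum of singular values and the thresholded count can differ, which is one reason to inspect the results.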

An estimation of df is generated for each feature specified in features.

Value

A named numeric vector of estimated df for each feature. Names correspond to features.

Examples

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# The limma::removeBatchEffect function is a commonly used function for batch effect correction:
mapping <- function(Y, batch, mod) {
  limma::removeBatchEffect(x = Y, batch = batch, design = mod)
}

#The following 2 lines were commented out, as df_estimate() is deprecated.
#dfs <- df_estimate(edata, features = 1, mapping = mapping, batch = pdata$batch, mod = mod1)
#dfs

randRotation documentation built on April 14, 2021, 6:01 p.m.