create.discr.matrix: Discriminatory Multivariate Data Generator


View source: R/data.generation.R

Description

Generates a matrix of dimensions dim(V) with induced correlations. D variables are randomly selected as discriminatory.

If num.groups = 2, discrimination is induced by adding values derived from the level of discrimination, l, to one class and subtracting them from the other.

Multi-class datasets have a few further levels of randomization. For each variable, a random number of the groups are selected as discriminating, while the remaining groups are not altered. Each discriminating group receives a unique change by randomly assigning addition or subtraction of the discrimination factor. When the same sign is assigned more than once, each repeat is multiplied by its replicate number: for example, if 3 groups are selected, with two assigned addition and the third subtraction, then (1,1,-1) becomes (1,2,-1). These values are randomized and then multiplied by the respective discrimination factor. The resulting values are then added to or subtracted from the respective groups.

Finally, a noise matrix is applied to the result to perturb 'perfect' discrimination.
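The procedure described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the toy matrix, the class labels "A" and "B", the helper scale_repeats, and the use of l directly as the shift are all assumptions made for illustration, and the final noise step is omitted.

```r
set.seed(1)

# Toy stand-in for the input matrix V: 10 samples x 6 variables
V <- matrix(rnorm(60), nrow = 10, ncol = 6)

# --- Two-group case ---------------------------------------------------
D <- 2                                   # number of discriminatory variables
l <- 1.5                                 # level of discrimination
classes <- rep(c("A", "B"), each = nrow(V) / 2)   # hypothetical labels
features <- sample(ncol(V), D)           # randomly chosen columns

U <- V
# Assumption: the shift is l itself; the package only says it is
# "derived from" l, so the real value may differ.
U[classes == "A", features] <- U[classes == "A", features] + l
U[classes == "B", features] <- U[classes == "B", features] - l

# --- Multi-class multiplier -------------------------------------------
# Hypothetical helper: each repeated sign is scaled by its running count,
# so that every selected group receives a distinct change.
scale_repeats <- function(signs) {
  counts <- ave(signs, signs, FUN = seq_along)
  signs * counts
}

multipliers <- scale_repeats(c(1, 1, -1))   # the (1,1,-1) example from the text
shifts <- multipliers * l                   # per-group shift for one variable
```

In this sketch, columns outside features are left untouched, which mirrors the statement that only the D selected variables are altered.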

Usage

create.discr.matrix(V, D = 20, l = 1.5, num.groups = 2, k = 4)

Arguments

V

Numeric matrix

D

Number of discriminatory variables induced. Default D = 20

l

Level of discrimination, higher = greater separation. Default l = 1.5

num.groups

Number of groups in the dataset. Default num.groups = 2

k

Correlation Perturbation - The higher k, the more the data is perturbed. Default k = 4

Value

List of the following elements

discr.mat

Matrix with the same number of rows as V and one additional column: the discriminatory variables are induced and the class labels (.classes) are appended as the last column.

features

Vector of features that were induced to be discriminatory.
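A common next step is to separate the variables from the class labels. The sketch below shows one way to do that, assuming the returned list has the shape described above. The object mock is a hand-built stand-in, not real output of create.discr.matrix, and holding discr.mat as a data frame is an assumption made so that numeric values and labels can share one object.

```r
set.seed(1)

# Hand-built stand-in with the documented shape (NOT package output):
# 4 samples x 3 variables, plus the class labels as the last column.
mock <- list(
  discr.mat = data.frame(matrix(rnorm(12), nrow = 4),
                         .classes = rep(c("A", "B"), each = 2)),
  features = c(1, 3)   # hypothetical indices of discriminatory variables
)

n <- ncol(mock$discr.mat)
vars   <- mock$discr.mat[, -n]   # everything except the last column
groups <- mock$discr.mat[, n]    # the appended class labels
```

Indexing by position rather than by name keeps the split independent of how the variable columns happen to be named.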

Author(s)

Charles E. Determan Jr.

References

Wongravee, K., Lloyd, G. R., Hall, J., Holmboe, M. E., & Schaefer, M. L. (2009). Monte-Carlo methods for determining optimal number of significant variables. Application to mouse urinary profiles. Metabolomics, 5(4), 387-406. http://dx.doi.org/10.1007/s11306-009-0164-4

Examples

# Load the package that provides the functions used below
library(OmicsMarkeR)

# Create Multivariate Matrices

# Random Multivariate Matrix

# 50 variables, 100 samples, 1 standard deviation, 0.2 noise factor

rand.mat <- create.random.matrix(nvar = 50, 
                                 nsamp = 100, 
                                 st.dev = 1, 
                                 perturb = 0.2)


# Induce correlations in a numeric matrix

# Default settings
# minimum and maximum block sizes (min.block.size = 2, max.block.size = 5)
# default correlation perturbation (k = 4)
# see ?create.corr.matrix for citation for methods

corr.mat <- create.corr.matrix(rand.mat)


# Induce Discriminatory Variables

# 10 discriminatory variables (D = 10)
# default discrimination level (l = 1.5)
# default number of groups (num.groups=2)
# default correlation perturbation (k = 4)

dat.discr <- create.discr.matrix(corr.mat, D=10)

Example output

solo last variable

OmicsMarkeR documentation built on April 28, 2020, 6:54 p.m.