synData: Decision table generator.


View source: R/synData.R

Description

Creates a decision table of correlated features.

Usage

synData(nFeatures = c(10, 5, 3, 2, 2), rf = c(0.2, 0.2, 0.2, 0.2, 0.2),
        rd = c(0.4, 0.5, 0.6, 0.7, 0.8), nObjects = 120, nOutcome = 2,
        distribution = "uniform", unbalanced = FALSE, pUnbalancedClass = 0.8,
        discrete = FALSE, levels = 4, labels = c("A", "C", "G", "T"),
        binProb = 0.5, seed = 1)

Arguments

nFeatures

A numeric vector of feature proportions, one value per feature set. The default is c(10,5,3,2,2).

rf

A numeric vector of correlations within each feature set. The default is c(0.2,0.2,0.2,0.2,0.2).

rd

A numeric vector of correlations between each feature and the decision. The default is c(0.4,0.5,0.6,0.7,0.8).

nObjects

A numeric value giving the number of objects. The default is 120.

nOutcome

A numeric value giving the number of decision classes. The default is 2.

distribution

A character value naming the distribution. For discrete data choose between "uniform" and "binomial". For non-discrete data choose between "uniform" and "normal". The default is "uniform".

unbalanced

Logical. Set TRUE to generate unbalanced data. Default is FALSE.

pUnbalancedClass

A numeric value giving the proportion of objects in the first class when unbalanced data are generated. The default is 0.8.

discrete

Logical. Set TRUE to generate discrete data. Default is FALSE.

levels

A numeric value giving the number of discretization levels. The default is 4.

labels

A character vector of labels for the discretization levels. The default is c("A","C","G","T").

binProb

A numeric value giving the probability for the binomial distribution. The default is 0.5.

seed

A numeric value used as the random seed. The default is 1.
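
To illustrate what a single value in rd expresses, the base R sketch below builds one continuous feature whose sample correlation with a two-class decision equals a chosen target. It is not the synData implementation, which this page does not describe; the variable names (decision, target_r, feature) and the orthogonalization step are assumptions made only for the illustration.

```r
set.seed(1)

n_objects <- 120
target_r  <- 0.6                                # plays the role of one rd entry

# Balanced two-class decision coded as 0/1
decision <- rep(c(0, 1), each = n_objects / 2)
z <- as.numeric(scale(decision))                # mean 0, sd 1

# Random noise, made exactly uncorrelated with the decision and rescaled
noise <- rnorm(n_objects)
noise <- as.numeric(scale(residuals(lm(noise ~ z))))

# Mix signal and noise so that cor(feature, decision) equals target_r
feature <- target_r * z + sqrt(1 - target_r^2) * noise

cor(feature, decision)
```

Because the noise is orthogonal to the decision, the sample correlation matches target_r up to floating-point error. A generator that draws features at random would normally only approximate the requested value.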

Value

output

A data frame with a decision table.
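
A quick way to check the returned object is sketched below. It calls synData() with all arguments left at their defaults and prints only generic properties of the result. The name syn_tab is chosen for the illustration, and the code is skipped when R.ROSETTA is not installed. The column layout is not documented on this page, so none is assumed here.

```r
# Runs only when the R.ROSETTA package is installed.
if (requireNamespace("R.ROSETTA", quietly = TRUE)) {
  syn_tab <- R.ROSETTA::synData()   # all arguments at their defaults

  print(is.data.frame(syn_tab))     # the page states a data frame is returned
  print(dim(syn_tab))               # number of rows and columns
  print(head(names(syn_tab)))       # first few column names
}
```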

Author(s)

Mateusz Garbulowski

Examples

library(R.ROSETTA)

### continuous data ###

## weak correlation
df1 <- synData(nFeatures=c(5,5,3,2,2), rf=c(0.2,0.3,0.2,0.4,0.4), rd=c(0.2,0.3,0.4,0.3,0.4))
out1 <- rosetta(df1)
out1$quality ## accuracy = 60%

## medium correlation
df2 <- synData(nFeatures=c(5,5,3,2,2), rf=c(0.2,0.3,0.2,0.4,0.4), rd=c(0.4,0.4,0.6,0.6,0.7))
out2 <- rosetta(df2)
out2$quality ## accuracy = 75%

## strong correlation
df3 <- synData(nFeatures=c(5,5,3,2,2), rf=c(0.2,0.3,0.2,0.4,0.4), rd=c(0.5,0.7,0.7,0.8,0.8))
out3 <- rosetta(df3)
out3$quality ## accuracy = 90%

### discrete data ###

## three discretization levels with custom labels
dfd <- synData(nFeatures=c(5,5,3,2,2), rf=c(0.2,0.3,0.2,0.4,0.4),
               rd=c(0.2,0.3,0.4,0.5,0.6), discrete = TRUE, levels = 3,
               labels = c("low", "medium", "high"))
outd <- rosetta(dfd, discrete = TRUE)
outd$quality ## accuracy = 85%
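
The relationship between levels and labels in the discrete example can be pictured with a small base R sketch that bins a continuous vector into three labelled levels. The equal-frequency cut points are an assumption for the illustration; this page does not state how synData chooses its bins. The names x, level_labels and x_disc are likewise hypothetical.

```r
set.seed(1)

x <- runif(120)                                  # one continuous feature
level_labels <- c("low", "medium", "high")       # one label per level

# Equal-frequency cut points (assumption): quantiles at 0, 1/3, 2/3 and 1
breaks <- quantile(x, probs = seq(0, 1, length.out = length(level_labels) + 1))

# Map each value to its labelled level; include.lowest keeps the minimum
x_disc <- cut(x, breaks = breaks, labels = level_labels, include.lowest = TRUE)

table(x_disc)
```

The number of labels has to match the number of levels, which is why the discrete example passes levels = 3 together with a three-element labels vector.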

mategarb/R.ROSETTA documentation built on April 2, 2021, 12:28 a.m.