makefastcmhdata: Create sample data for fastcmh

View source: R/makefastcmhdata.R

Description

This function creates sample data for use with the runfastcmh method.

Usage

makefastcmhdata(folder = "./", xfilename = "data.txt",
  yfilename = "label.txt", covfilename = "cov.txt", K = 2, L = 1000,
  n = 200, noiseP = 0.3, corruptP = 0.05, rho = 0.8, tau1 = 100,
  taulength1 = 4, tau2 = 200, taulength2 = 4, seednum = 2,
  truetaufilename = "truetau.txt", showOutput = FALSE, saveToList = FALSE)

Arguments

folder

The folder in which the data will be saved. Default is the current directory, "./".

xfilename

The name of the data file. Default is "data.txt".

yfilename

The name of the label file. Default is "label.txt".

covfilename

The name of the file containing the covariate categories. This file contains just K numbers, where K is the number of covariates. Default is "cov.txt".

K

The number of covariates (a positive integer). Default is K=2.

L

The number of features (length of each sequence). Default is L=1000.

n

The number of samples (cases and controls combined). Default is n=200, i.e. 100 cases and 100 controls.

noiseP

The background noise in the data (as the probability of a 0/1 value being flipped). Default is noiseP=0.3.

corruptP

The probability of data corruption: each bit has probability corruptP of being flipped. Default is corruptP=0.05.

rho

The strength of the confounding in the confounded interval (as a probability). Default is rho=0.8 (i.e. a very strong signal).

tau1

The location of the significant interval (starting point). Default value is tau1=100.

taulength1

The length of the significant interval. Default value is taulength1=4, so default significant interval is [100, 103].

tau2

The location of the confounded significant interval (starting point). Default value is tau2=200.

taulength2

The length of the confounded significant interval. Default value is taulength2=4, so the default confounded significant interval is [200, 203].

seednum

The seed used for generating the data. Default value is seednum=2.

truetaufilename

The file where the locations of the true significant intervals are saved (as opposed to the detected significant intervals). Default is "truetau.txt".

showOutput

Flag to decide whether or not to show output describing where the files are created, their names, etc. Default is FALSE.

saveToList

Flag to decide whether to save the data to the folder or to return (output) the data as a list. Default is FALSE, so the data are saved to the folder by default. When saveToList=TRUE, the returned list consists of the data, label and cov data frames. All of the examples use saveToList=TRUE in order to avoid writing to file.
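
The interval arithmetic implied by the tau and taulength arguments can be sketched in base R. This is only an illustration: interval_from_tau is a hypothetical helper, not a function in fastcmh, and the check against L is an assumption about what a sensible parameter choice looks like rather than documented behaviour.

```r
# Hypothetical helper, not part of fastcmh: closed interval of feature
# indices covered by a start position (tau) and a length (taulength).
interval_from_tau <- function(tau, taulength) {
  c(start = tau, end = tau + taulength - 1)
}

# Default values taken from the Usage section
sig_interval  <- interval_from_tau(tau = 100, taulength = 4)  # tau1, taulength1
conf_interval <- interval_from_tau(tau = 200, taulength = 4)  # tau2, taulength2

print(sig_interval)
print(conf_interval)

# Assumed sanity check: both intervals should fit within the L features
L <- 1000
stopifnot(sig_interval["end"] <= L, conf_interval["end"] <= L)
```

With the defaults this gives [100, 103] and [200, 203], matching the intervals quoted in the taulength1 and taulength2 descriptions.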

See Also

runfastcmh

Examples

#make a small sample data set, using the default parameters
mylist <- makefastcmhdata(showOutput=TRUE, saveToList=TRUE)

#make a very small sample data set
mylist <- makefastcmhdata(n=20, L=10, tau1=2, taulength1=2,
       tau2=6, taulength2=2, saveToList=TRUE)
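
To see what comes back when saveToList=TRUE, the returned list can be inspected with base R tools. The following is a minimal sketch that assumes the fastcmh package is installed (it is skipped otherwise) and makes no assumption about the exact component names beyond what the Arguments section states.

```r
# Runs only if fastcmh is available; otherwise nothing is executed.
if (requireNamespace("fastcmh", quietly = TRUE)) {
  mylist <- fastcmh::makefastcmhdata(n = 20, L = 10, tau1 = 2, taulength1 = 2,
                                     tau2 = 6, taulength2 = 2,
                                     saveToList = TRUE)

  # Show the component names and a one-level summary of each component
  print(names(mylist))
  str(mylist, max.level = 1)
}
```

Using names() and str() rather than hard-coded component names keeps the sketch valid even if the list layout differs from what is assumed here.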

fastcmh documentation built on May 2, 2019, 10:13 a.m.