View source: R/dataset.subsample.R
datasource.subsample
randomly picks the specified number of samples from the original
datasource and, if requested, adds noise to the subsampled dataset.
datasource.subsample(datasource, experiments=NA, datasets.num=5,
                     local.noise=20, global.noise=0, noiseType="normal",
                     samplevar=TRUE)
datasource
data.frame where columns contain variables and rows contain experiments.
experiments
Integer specifying the number of experiments to use when subsampling the datasource (default: NA).
datasets.num
Integer specifying the number of datasets to be generated for each of the selected original datasources (default: 5).
local.noise
Integer specifying the desired percentage of local noise to be added to each of the subsampled datasets (default: 20).
global.noise
Integer specifying the desired percentage of global noise to be added to each of the subsampled datasets (default: 0).
noiseType
Character specifying the type of the noise to be added: "normal" (default: "normal").
samplevar
(default: TRUE).
If the argument experiments is NA, its value is calculated
automatically so that datasets.num smaller datasets are produced,
none of which contains the same experiment twice.
Each subsampled dataset contains approximately experiments ±20%
experiments, chosen randomly without replacement from the
experiments of the original datasource.
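The subsampling step described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper subsample_rows, the toy data, the even split used for the automatic value of experiments, and the uniform draw of the size within ±20% are all assumptions.

```r
# Illustrative sketch only; not the code used by datasource.subsample.
set.seed(1)
toy <- data.frame(matrix(rnorm(100 * 4), nrow = 100))  # 100 experiments, 4 variables

# Hypothetical helper: draw datasets.num row subsets from `data`.
subsample_rows <- function(data, experiments, datasets.num = 5) {
  n <- nrow(data)
  lapply(seq_len(datasets.num), function(i) {
    # Assumption: the size is drawn uniformly within +/-20% of `experiments`
    size <- round(runif(1, 0.8 * experiments, 1.2 * experiments))
    size <- max(1, min(n, size))
    # Sampling without replacement: no experiment appears twice in a dataset
    data[sample(n, size, replace = FALSE), , drop = FALSE]
  })
}

# Assumption: the automatic value splits the experiments evenly
experiments <- floor(nrow(toy) / 5)
subsets <- subsample_rows(toy, experiments)
sapply(subsets, nrow)
```

Every element keeps all the columns of toy, and only the set of rows differs between elements.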
If the argument experiments is a number, datasets.num is
calculated automatically.
If the specified number of experiments is greater than or equal
to the original number of experiments, only one replicate is
generated: the subsampled dataset has the same dimensions as the
original one, but its experiments are in random order.
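The replicate case can be pictured as a plain row shuffle. The sketch below is a base R illustration under that assumption; the names toy and replicate.data are hypothetical, and the package may produce the replicate differently.

```r
# Illustrative sketch only; not the code used by datasource.subsample.
set.seed(2)
toy <- data.frame(a = 1:6, b = (1:6)^2)  # 6 experiments, 2 variables

experiments <- 10  # requested number is >= nrow(toy)
if (experiments >= nrow(toy)) {
  # sample(n) returns a random permutation of 1..n, so every row is kept once
  replicate.data <- toy[sample(nrow(toy)), , drop = FALSE]
}

dim(replicate.data)
```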
Two different types of noise can be added, specified
with the argument noiseType:
"local": the variance of the noise is different for each variable; it is the specified percentage of that variable's variance (±20%).
"global": the variance of the noise is the same for the whole datasource; it is the specified percentage of the mean variance of all the variables (±20%).
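The difference between the two noise types can be sketched in base R with zero-mean Gaussian noise. The helpers add_local_noise and add_global_noise are hypothetical names introduced for illustration, the ±20% variation of the percentage is left out for simplicity, and the package's own computation may differ.

```r
# Illustrative sketch only; not the code used by datasource.subsample.
set.seed(3)
toy <- data.frame(x = rnorm(200, sd = 1), y = rnorm(200, sd = 10))

# Local noise: the noise variance is a percentage of each variable's own variance
add_local_noise <- function(data, percent) {
  out <- data
  for (v in names(data)) {
    noise.sd <- sqrt(var(data[[v]]) * percent / 100)
    out[[v]] <- data[[v]] + rnorm(nrow(data), mean = 0, sd = noise.sd)
  }
  out
}

# Global noise: one noise variance for all variables, a percentage of the
# mean of the per-variable variances
add_global_noise <- function(data, percent) {
  noise.sd <- sqrt(mean(sapply(data, var)) * percent / 100)
  out <- data
  for (v in names(data)) {
    out[[v]] <- data[[v]] + rnorm(nrow(data), mean = 0, sd = noise.sd)
  }
  out
}

local.noisy  <- add_local_noise(toy, 20)
global.noisy <- add_global_noise(toy, 20)

# Standard deviation of the noise actually added to each variable
sapply(names(toy), function(v) sd(local.noisy[[v]] - toy[[v]]))
sapply(names(toy), function(v) sd(global.noisy[[v]] - toy[[v]]))
```

With local noise, the high-variance variable y receives much stronger noise than x; with global noise, both variables receive noise of about the same magnitude.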
datasource.subsample
returns a list with datasets.num elements. Each element is a
data.frame containing a subsampled dataset with the specified
amount of Gaussian noise added and the same number of variables
as the original datasource.
# Subsample
data.list.1 <- datasource.subsample(syntren300.data)
data.list.2 <- datasource.subsample(syntren300.data,
                                    local.noise=10)
# Inference
inf.net.1 <- cor(data.list.1[[1]])
inf.net.2 <- cor(data.list.2[[4]])