View source: R/normalizeData.R
normalizeData | R Documentation |
normalizeData
is used to transform a data collection into a normalized
form suitable for GFA.
This function does two things:
1. It centers each variable. GFA assumes zero-mean data, as it models variances.
2. It normalizes the scales of variables and/or variable groups. Features with
higher variance affect the model structure more; if this is not desired, the
data should be normalized.
In GFA it is also possible to normalize the importance of variable groups
(data sources), in addition to or instead of individual variables. Finally,
the total variance of the data is normalized for numerical reasons. This is
particularly important if no other normalization is done.
NOTE: the function assumes continuous-valued data. If some features are, for
example, binary with only a small proportion of 1s, we do not recommend
centering them.
normalizeData(train, test = NULL, type = "scaleOverall")
train |
a training data set. For a detailed description, see parameter Y
in |
test |
a test dataset. Should be provided if sequential prediction is used later. |
type |
The type of normalization to perform. The features are mean-centered in all cases; option "center" performs no scaling. Option "scaleOverall" (default) uses a single parameter to scale the variance of the whole data collection to 1, while "scaleSources" scales each data source to have variance 1. Finally, "scaleFeatures" performs z-normalization, i.e. scales each feature to have variance 1. |
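The four options can be illustrated with a small base-R sketch. This is not the package's implementation: `sketch_normalize` is a hypothetical helper, and it assumes that the data collection is a list of numeric matrices (one per data source, samples in rows) and that the "variance" of a block is the mean of its squared centered entries. The package may use a different estimator.

```r
# Hypothetical helper, not part of GFA: illustrates the four `type` options
# on a list of numeric matrices (one matrix per data source, rows = samples).
sketch_normalize <- function(train, type = "scaleOverall") {
  type <- match.arg(type, c("center", "scaleOverall", "scaleSources", "scaleFeatures"))
  # Mean-centering of every feature is done for all types.
  centered <- lapply(train, function(y) sweep(y, 2, colMeans(y), "-"))
  # Assumption: the "variance" of a block is the mean of its squared centered entries.
  block_sd <- function(blocks) sqrt(mean(unlist(blocks)^2))
  switch(type,
    center = centered,
    scaleOverall = {
      s <- block_sd(centered)  # one scale parameter for the whole collection
      lapply(centered, function(y) y / s)
    },
    # One scale parameter per data source.
    scaleSources = lapply(centered, function(y) y / block_sd(list(y))),
    # One scale parameter per feature (column).
    scaleFeatures = lapply(centered, function(y) sweep(y, 2, sqrt(colMeans(y^2)), "/"))
  )
}

set.seed(1)
train <- list(matrix(rnorm(40, mean = 3, sd = 5), 10, 4),
              matrix(rnorm(20, mean = -1, sd = 0.5), 10, 2))
out <- sketch_normalize(train, "scaleSources")
sapply(out, function(y) mean(y^2))  # per-source variance after scaling
```

With "scaleSources", the two sources end up on the same overall scale even though their raw variances differ by two orders of magnitude; with "scaleOverall" their relative scales would be preserved.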
A list containing the following elements:
train |
Normalized training data. |
test |
Normalized test data for sequential prediction (if provided as input). |
trainMean |
Feature-wise means of the training data sources. |
trainSd |
Feature-wise/overall standard deviations of the training data sources. |
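The returned means and standard deviations make it possible to put new data on the training scale and to map values back to the original scale. The base-R sketch below shows the idea for feature-wise scaling. It is an illustration under assumptions, not GFA code: `apply_stats` is a hypothetical helper, and the premise that test data are normalized with the training statistics is inferred from the return values rather than stated in this page.

```r
# Hypothetical sketch, not GFA code: reuse training statistics on test data.
set.seed(2)
train <- list(matrix(rnorm(60, mean = 2, sd = 3), 20, 3))
test  <- list(matrix(rnorm(15, mean = 2, sd = 3), 5, 3))

# Feature-wise statistics computed on the training data only.
train_mean <- lapply(train, colMeans)
train_sd   <- lapply(train, function(y) apply(y, 2, sd))

# Apply the same shift and scale to a data collection, source by source.
apply_stats <- function(data, means, sds) {
  Map(function(y, mu, s) sweep(sweep(y, 2, mu, "-"), 2, s, "/"), data, means, sds)
}

train_norm <- apply_stats(train, train_mean, train_sd)
test_norm  <- apply_stats(test, train_mean, train_sd)

# Undo the normalization, e.g. to express predictions on the original scale.
test_back <- Map(function(y, mu, s) sweep(sweep(y, 2, s, "*"), 2, mu, "+"),
                 test_norm, train_mean, train_sd)
```

Computing the statistics on the training data alone keeps information from the test samples out of the normalization.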