This function takes input data and, optionally, a table of sample metadata, and rearranges the data into a matrix of features to be used for model building.
get_cluster_features(
  tab,
  predictors = NULL,
  metadata.tab = NULL,
  variable.var = "variable",
  value.var = "value",
  endpoint.grouping = NULL,
  sample.col = "sample.id"
)
tab |
A |
predictors |
Columns in |
metadata.tab |
Optional. A |
variable.var |
The column in |
value.var |
The column in |
endpoint.grouping |
Columns in |
sample.col |
Optional, only used if |
The input table needs to be in molten format (see reshape2::melt), with the variable.var and value.var columns identifying variables and their values (for instance cell population abundances). The metadata.tab, if provided, must contain a column (identified by the sample.col function argument) that matches the names of the samples in tab (i.e. the part after the @, "sample1" in the above example). The rest of the columns in metadata.tab represent file-level metadata, which is used to identify the data corresponding to a given combination of predictors (see below).
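As a rough illustration of what "molten format" means, the sketch below melts a small wide table with reshape2::melt. The table contents, the cluster column, and the "abundance@sample1" column naming are assumptions made for this example (modelled on the @ convention mentioned above), not values taken from the package.

```r
library(reshape2)

# Hypothetical wide table: one row per cluster, one column per
# feature@sample combination (names are illustrative assumptions).
wide <- data.frame(
  cluster = c(1, 2),
  "abundance@sample1" = c(0.10, 0.90),
  "abundance@sample2" = c(0.25, 0.75),
  check.names = FALSE
)

# melt() stacks the measured columns into two columns, by default
# named "variable" and "value", which match the function defaults
# for variable.var and value.var.
molten <- melt(wide, id.vars = "cluster")

print(molten)
```

Each row of the molten table then holds a single measurement, identified by the cluster and the variable it came from.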
An example will help clarify how this function works. Suppose you have collected data from multiple patients at multiple timepoints and under multiple stimulation conditions. In this case the metadata.tab would contain the following columns:
sample.id: Used to merge sample metadata with the input data (see above)
timepoint: The timepoint information
condition: The stimulation condition
subject: The subject each file was derived from
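A metadata table with these four columns could be built as in the sketch below. All of the values (subject names, timepoint and condition labels, sample identifiers) are hypothetical and chosen only to show the shape of the table: one row per sample, with a unique sample.id.

```r
# Hypothetical design: 2 subjects x 2 timepoints x 2 conditions.
design <- expand.grid(
  timepoint = c("t0", "t1"),
  condition = c("unstim", "stim"),
  subject = c("subjectA", "subjectB"),
  stringsAsFactors = FALSE
)

# One sample (file) per row; sample.id is the key that would be
# matched against the sample names in the input data.
metadata.tab <- data.frame(
  sample.id = paste0("sample", seq_len(nrow(design))),
  design,
  stringsAsFactors = FALSE
)

print(metadata.tab)
```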
Let's assume a few different scenarios.
You have subject level information (e.g. "responder" vs "non-responder") and you want to predict whether any combination of the timepoint and condition information predicts this outcome. In this case you would call the function with predictors = c("condition", "timepoint") and endpoint.grouping = "sample". The features in the resulting output would look like cluster_1_feature1_condition_timepoint.
You have subject and timepoint level information, and you want to see if any of the stimulation conditions predicts it. In this case you would call the function with predictors = c("condition") and endpoint.grouping = c("sample", "timepoint"). The features in the resulting output would look like cluster_1_feature1_condition.
Internally this function uses reshape2::dcast to structure the data in the appropriate format with the following formula (see the reshape2::dcast documentation for details on how the formula is interpreted):
endpoint.grouping1 + endpoint.grouping2 + ... ~ variable.var + predictors1 + predictors2 + ...
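To make the formula concrete, here is a minimal sketch that applies reshape2::dcast directly to a small molten table. It is not the package's internal code; the data, the column names (subject, condition, timepoint), and the variable labels are assumptions made for illustration. Here subject plays the role of the endpoint grouping, and condition and timepoint play the role of the predictors.

```r
library(reshape2)

# Hypothetical molten data with metadata already merged in:
# one value per subject / condition / timepoint / variable.
molten <- expand.grid(
  subject = c("subjectA", "subjectB"),
  condition = c("stim", "unstim"),
  timepoint = c("t0", "t1"),
  variable = c("cluster_1_feature1", "cluster_2_feature1"),
  stringsAsFactors = FALSE
)
molten$value <- seq_len(nrow(molten))

# Rows: levels of the grouping column (left-hand side).
# Columns: combinations of variable and predictor levels (right-hand side).
wide <- dcast(molten, subject ~ variable + condition + timepoint,
              value.var = "value")

# Convert to a numeric matrix with one row per subject.
features <- as.matrix(wide[, -1])
rownames(features) <- wide$subject

print(colnames(features))
```

dcast joins the right-hand-side levels with underscores, so a column such as cluster_1_feature1_stim_t0 holds the value of that variable under the "stim" condition at timepoint "t0".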
Returns a matrix where each row corresponds to a combination of the levels of the variables specified in endpoint.grouping, and the columns are numeric features corresponding to combinations of the levels of the predictors.