mv_ag (R Documentation)
Earth mover's distance (EMD) computations become expensive on large datasets. For example, in a LemnaTec dataset filtered to images from every 5th day, there are 6332^2 = 40,094,224 pairwise EMD values. In long format that is a data frame of roughly 40 million rows, which is unwieldy. This function reduces the size of a dataset before histograms are compared, so that later matrix methods or network analysis stay manageable.
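The arithmetic above can be checked directly, along with the effect of aggregating before comparing. This is a base R sketch only; the group count `n_groups` is a hypothetical value chosen for illustration and does not come from the dataset described.

```r
# Pairwise EMD values for the dataset described above (full n x n comparison).
n_images <- 6332
n_pairs_full <- n_images^2

# Hypothetical design: 100 groups, aggregated to 1 row per group.
n_groups <- 100        # assumption, for illustration only
n_per_group <- 1
n_rows_reduced <- n_groups * n_per_group
n_pairs_reduced <- n_rows_reduced^2

# Print both counts with thousands separators for comparison.
format(n_pairs_full, big.mark = ",")
format(n_pairs_reduced, big.mark = ",")
```

Because the number of comparisons grows with the square of the number of rows, even a modest reduction in rows shrinks the pairwise table substantially.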
mv_ag(
  df,
  group,
  mvCols = "frequencies",
  n_per_group = 1,
  outRows = NULL,
  keep = NULL,
  parallel = getOption("mc.cores", 1),
  traitCol = "trait",
  labelCol = "label",
  valueCol = "value",
  id = "image"
)
df
A dataframe with multi-value traits. This can be in wide or long format. The data is assumed to be long if traitCol, valueCol, and labelCol are all present.
group
Vector of column names for the variables that uniquely identify the groups to summarize over. Typically these are the design variables and a time variable.
mvCols
Either a vector of column names or positions for the multi-value traits, or a single character string used as a regex pattern to identify the multi-value trait columns. Defaults to "frequencies".
n_per_group
Number of rows to return for each group. Defaults to 1.
outRows
Optional alternative way to specify how many rows to return in total. The result will often not match this number exactly, because every group is given the same number of observations.
keep
A vector of single-value traits to also average over groups, for data that mixes single-value and multi-value traits.
parallel
Optional number of cores to use for running the groups in parallel. Defaults to 1 if the "mc.cores" option is not set globally.
traitCol
Column with phenotype names. Defaults to "trait".
labelCol
Column with phenotype labels (units). Defaults to "label".
valueCol
Column with phenotype values. Defaults to "value".
id
Column that uniquely identifies images if the data is in long format. Ignored when the data is in wide format.
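A few of the argument rules above can be illustrated in base R. This is a rough sketch, not the function's internals: the toy data, the helper `is_long`, and in particular the rounding rule used for `outRows` are assumptions made for illustration.

```r
# Toy wide-format data; all column names here are hypothetical.
wide_df <- data.frame(
  group = c("a", "a", "b", "b"),
  area  = c(10, 12, 20, 22),
  sim_1 = c(0.2, 0.3, 0.5, 0.4),
  sim_2 = c(0.8, 0.7, 0.5, 0.6)
)

# mvCols given as a regex pattern: select the matching column names.
mv_cols <- grep("sim_", colnames(wide_df), value = TRUE)

# Long format is assumed when the trait, label, and value columns are all present.
# is_long() is a hypothetical helper, not part of the documented API.
is_long <- function(df, traitCol = "trait", labelCol = "label", valueCol = "value") {
  all(c(traitCol, labelCol, valueCol) %in% colnames(df))
}
is_long(wide_df)

# outRows: ASSUMED rule (round down to an equal number of rows per group).
# This shows why the requested total may not be matched exactly.
out_rows <- 25
n_groups <- length(unique(wide_df$group))
per_group <- max(1, floor(out_rows / n_groups))
rows_returned <- per_group * n_groups
```

With two groups and a request for 25 rows, equal group sizes cannot produce exactly 25 rows, which is the kind of mismatch the `outRows` description warns about.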
Returns a dataframe summarized by the specified groups over the multi-value traits.
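To make the shape of the returned dataframe concrete, here is one plausible way to summarize multi-value columns down to a fixed number of rows per group in base R. It is a sketch under stated assumptions (rows are randomly split into subsets and the histogram columns are averaged within each subset) and is not necessarily how mv_ag is implemented. The data and the helper `agg_one` are hypothetical.

```r
set.seed(1)

# Toy wide-format data: 2 groups, 4 images each, 3 histogram bins.
df <- data.frame(
  group = rep(c("a", "b"), each = 4),
  sim_1 = runif(8),
  sim_2 = runif(8),
  sim_3 = runif(8)
)
mv_cols <- grep("sim_", colnames(df), value = TRUE)
n_per_group <- 2

# Hypothetical helper: split one group's rows into n_out random subsets
# and average the multi-value columns within each subset.
agg_one <- function(d, n_out) {
  subset_id <- sample(rep_len(seq_len(n_out), nrow(d)))
  out <- aggregate(d[, mv_cols], by = list(subset = subset_id), FUN = mean)
  cbind(group = d$group[1], out[, mv_cols, drop = FALSE])
}

# Apply the helper to each group and stack the results.
res <- do.call(rbind, lapply(split(df, df$group), agg_one, n_out = n_per_group))
rownames(res) <- NULL
res
```

The result has `n_per_group` rows for each group and the same multi-value columns as the input, so a later pairwise comparison operates on 4 rows instead of 8.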
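# Assumes the package that provides mvSim() and mv_ag() is already attached.
# Simulate multi-value trait data in wide format from a uniform distribution.
s1 <- mvSim(
  dists = list(runif = list(min = 15, max = 150)),
  n_samples = 10,
  counts = 1000,
  min_bin = 1,
  max_bin = 180,
  wide = TRUE
)
# Aggregate to 2 rows per group; "sim_" is a regex matching the histogram columns.
mv_ag(s1, group = "group", mvCols = "sim_", n_per_group = 2)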