metric.stats | R Documentation |
This function calculates metric statistics for use in developing a multi-metric index.
The input is a data frame of metric values with columns that identify each sample, its reference status, and its data type.
metric.stats(
fun.DF,
col_metrics,
col_SampID = "SAMPLEID",
col_RefStatus = "Ref_Status",
RefStatus_Ref = "Ref",
RefStatus_Str = "Str",
RefStatus_Oth = "Oth",
col_DataType = "Data_Type",
DataType_Cal = "Cal",
DataType_Ver = "Ver",
col_Subset = NULL,
Subset_Value = NULL
)
fun.DF | Data frame. |
col_metrics | Column names for metrics. |
col_SampID | Column name for unique sample identifier. Default = "SAMPLEID". |
col_RefStatus | Column name for Reference Status. Default = "Ref_Status". |
RefStatus_Ref | Reference Status name for Reference used in col_RefStatus. Default = "Ref". Use NULL if you don't use this value. |
RefStatus_Str | Reference Status name for Stressed used in col_RefStatus. Default = "Str". Use NULL if you don't use this value. |
RefStatus_Oth | Reference Status name for Other used in col_RefStatus. Default = "Oth". Use NULL if you don't use this value. |
col_DataType | Column name for Data Type (Calibration vs. Verification). Default = "Data_Type". |
DataType_Cal | Data Type name for Calibration used in col_DataType. Default = "Cal". Use NULL if you don't use this value. |
DataType_Ver | Data Type name for Verification used in col_DataType. Default = "Ver". Use NULL if you don't use this value. |
col_Subset | Column name used to subset the data. Default = NULL. If NULL, no subset is generated. |
Subset_Value | Value in col_Subset used to create the subset. Default = NULL. |
Summary statistics for the data are calculated.
The data are filtered on the column given in col_Subset to the single value given in Subset_Value. If further subsets are needed, re-run the function for each one. If no subset is given, the entire data set is used.
Statistics are generated for up to six combinations of RefStatus (Ref, Oth, Str) and DataType (Cal, Ver).
The resulting data frame has the statistics in columns, preceded by four leading columns: INDEX_CLASS (if col_Subset is not provided), col_RefStatus, col_DataType, and Metric_Name.
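The filtering and grouping described above can be sketched in base R. This is only an illustration under stated assumptions, not the function's actual implementation: the toy data frame, the subset value "A", and the object names df_sub and combos are hypothetical, while the column names and status labels follow the function defaults.

```r
# Toy data (hypothetical); column names and labels mirror the defaults.
df <- data.frame(
  SAMPLEID    = paste0("S", 1:8),
  INDEX_CLASS = c("A", "A", "A", "A", "A", "A", "B", "B"),
  Ref_Status  = c("Ref", "Ref", "Str", "Str", "Oth", "Oth", "Ref", "Str"),
  Data_Type   = c("Cal", "Ver", "Cal", "Ver", "Cal", "Ver", "Cal", "Cal"),
  metric_1    = c(10, 12, 4, 5, 7, 8, 11, 3),
  stringsAsFactors = FALSE
)

# Step 1: keep only the rows for a single subset value.
df_sub <- df[df$INDEX_CLASS %in% "A", ]

# Step 2: tabulate the RefStatus x DataType combinations that remain.
# With three status labels and two data types there are at most six.
combos <- as.data.frame(
  table(RefStatus = df_sub$Ref_Status, DataType = df_sub$Data_Type),
  stringsAsFactors = FALSE
)
print(combos)
```

A combination with a count of zero in such a table would have no samples to summarize, which is why the text says "up to" six.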
The following statistics are generated with na.rm = TRUE.
* n = number
* min = minimum
* max = maximum
* mean = mean
* median = median
* range = range (max - min)
* sd = standard deviation
* cv = coefficient of variation (sd/mean)
* q05 = quantile, 5%
* q10 = quantile, 10%
* q25 = quantile, 25%
* q50 = quantile, 50%
* q75 = quantile, 75%
* q90 = quantile, 90%
* q95 = quantile, 95%
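For reference, the statistics listed above can be reproduced for a single numeric vector with base R. The helper calc_stats below is hypothetical and not part of the package. It assumes that n counts non-missing values and that quantiles use R's default method (type 7); the documentation does not state either detail.

```r
# Hypothetical helper: summary statistics for one numeric vector,
# with missing values removed first (the equivalent of na.rm = TRUE).
calc_stats <- function(x) {
  x <- x[!is.na(x)]
  q <- stats::quantile(x,
                       probs = c(0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95),
                       names = FALSE)
  data.frame(
    n      = length(x),           # assumption: count of non-missing values
    min    = min(x),
    max    = max(x),
    mean   = mean(x),
    median = stats::median(x),
    range  = max(x) - min(x),
    sd     = stats::sd(x),
    cv     = stats::sd(x) / mean(x),
    q05 = q[1], q10 = q[2], q25 = q[3], q50 = q[4],
    q75 = q[5], q90 = q[6], q95 = q[7]
  )
}

vals <- c(2, 4, 4, 5, NA, 10)
res <- calc_stats(vals)
print(res)  # n = 5, mean = 5, range = 8, sd = 3, cv = 0.6
```

Note that cv is undefined when the mean is zero, so metrics with a zero mean would need separate handling.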
A data frame of metrics (rows) and statistics (columns), in long format with columns for INDEX_CLASS, RefStatus, and DataType.
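The long layout, one row per group and metric with statistics in columns, can be illustrated with base R. This is a sketch under stated assumptions: the toy data, the metric names m1 and m2, and the output column names value.n and value.mean are hypothetical and will not match the function's real output.

```r
# Toy wide data (hypothetical): one row per sample, one column per metric.
df <- data.frame(
  Ref_Status = c("Ref", "Ref", "Str", "Str"),
  Data_Type  = c("Cal", "Cal", "Cal", "Cal"),
  m1 = c(1, 3, 5, 7),
  m2 = c(10, 20, 30, 50),
  stringsAsFactors = FALSE
)
metrics <- c("m1", "m2")

# Stack the metric columns so each row holds one sample-metric value.
long <- data.frame(
  Ref_Status  = rep(df$Ref_Status, times = length(metrics)),
  Data_Type   = rep(df$Data_Type, times = length(metrics)),
  Metric_Name = rep(metrics, each = nrow(df)),
  value       = unlist(df[metrics], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Summarize each RefStatus x DataType x metric group (two statistics only).
out <- aggregate(value ~ Ref_Status + Data_Type + Metric_Name,
                 data = long,
                 FUN = function(v) c(n = length(v), mean = mean(v)))
out <- do.call(data.frame, out)  # flatten the matrix column from aggregate()
print(out)
```

Each row of out describes one metric within one group, which matches the shape described above.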
# data, benthos
df_bugs <- data_mmi_dev
# Munge Names
names(df_bugs)[names(df_bugs) %in% "BenSampID"] <- "SAMPLEID"
names(df_bugs)[names(df_bugs) %in% "TaxaID"] <- "TAXAID"
names(df_bugs)[names(df_bugs) %in% "Individuals"] <- "N_TAXA"
names(df_bugs)[names(df_bugs) %in% "Exclude"] <- "EXCLUDE"
names(df_bugs)[names(df_bugs) %in% "Class"] <- "INDEX_CLASS"
names(df_bugs)[names(df_bugs) %in% "Unique_ID"] <- "SITEID"
# Calc Metrics
cols_keep <- c("Ref_v1", "CalVal_Class4", "SITEID", "CollDate", "CollMeth")
# INDEX_NAME and INDEX_CLASS kept by default
df_metval <- metric.values(df_bugs, "bugs", fun.cols2keep = cols_keep)
# Calc Stats
col_metrics <- names(df_metval)[9:ncol(df_metval)]
col_SampID <- "SAMPLEID"
col_RefStatus <- "REF_V1"
RefStatus_Ref <- "Ref"
RefStatus_Str <- "Strs"
RefStatus_Oth <- "Other"
col_DataType <- "CALVAL_CLASS4"
DataType_Cal <- "cal"
DataType_Ver <- "verif"
col_Subset <- "INDEX_CLASS"
Subset_Value <- "CENTRALHILLS"
df_stats <- metric.stats(df_metval
, col_metrics
, col_SampID
, col_RefStatus
, RefStatus_Ref
, RefStatus_Str
, RefStatus_Oth
, col_DataType
, DataType_Cal
, DataType_Ver
, col_Subset
, Subset_Value)
## Not run:
# Save Results
write.table(df_stats
, file.path(tempdir(), "metric.stats.tsv")
, col.names = TRUE
, row.names = FALSE
, sep = "\t")
## End(Not run)