View source: R/metric_stats2.R
metric.stats2 | R Documentation |
This function calculates secondary statistics (DE and z-score) on metric statistics for use with developing a multi-metric index.
metric.stats2(
data_metval,
data_metstat,
col_metval_RefStatus = "RefStatus",
col_metval_DataType = "DataType",
col_metval_Subset = "INDEX_CLASS",
col_metstat_RefStatus = "RefStatus",
col_metstat_DataType = "DataType",
col_metstat_Subset = "INDEX_CLASS",
RefStatus_Ref = "Ref",
RefStatus_Str = "Str",
RefStatus_Oth = "Oth",
DataType_Cal = "Cal",
DataType_Ver = "Ver",
Subset_Value = NULL
)
data_metval |
Data frame of metric values. |
data_metstat |
Data frame of metric statistics (output of metric.stats). |
col_metval_RefStatus |
Column name for Reference Status in data_metval. Default = "RefStatus" |
col_metval_DataType |
Column name for Data Type (Validation vs. Calibration) in data_metval. Default = "DataType" |
col_metval_Subset |
Column name for INDEX_CLASS in data_metval. Default = "INDEX_CLASS" |
col_metstat_RefStatus |
Column name for Reference Status in data_metstat. Default = "RefStatus" |
col_metstat_DataType |
Column name for Data Type (Validation vs. Calibration) in data_metstat. Default = "DataType" |
col_metstat_Subset |
Column name for INDEX_CLASS in data_metstat. Default = "INDEX_CLASS" |
RefStatus_Ref |
RefStatus value for Reference. Default = "Ref" |
RefStatus_Str |
RefStatus value for Stressed. Default = "Str" |
RefStatus_Oth |
RefStatus value for Other. Default = "Oth" |
DataType_Cal |
DataType value for Calibration. Default = "Cal" |
DataType_Ver |
DataType value for Verification. Default = "Ver" |
Subset_Value |
Subset value of INDEX_CLASS (site class). Default = NULL |
Secondary metric statistics are calculated for the data.
Inputs are the metric values and the metric stats outputs.
The metric values input is in wide format, with one column per metric. It is assumed to contain only a single Subset.
The metric stats input is in wide format, with one column per statistic and the metric names in a single column. It is assumed to contain only a single Subset.
Required fields are RefStatus, DataType, and INDEX_CLASS. The user may specify their own column names for these fields for each input.
The two statistics calculated are z-score and discrimination efficiency (DE) for each metric within each DataType (cal / val).
Z-scores are calculated using the calibration (or development) data set for a given INDEX_CLASS (or Site Class).
* (mean Ref - mean Str) / sd Ref
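The z-score formula above can be illustrated with a minimal base R sketch. The vectors `ref_vals` and `str_vals` are hypothetical toy values for a single metric in the calibration data. They are not taken from the package data, and the code is not the function's internal implementation.

```r
# Hypothetical toy values for one metric (calibration samples only)
ref_vals <- c(20, 22, 24, 26, 28)  # reference samples
str_vals <- c(10, 12, 14, 16, 18)  # stressed samples

# z-score = (mean Ref - mean Str) / sd Ref
z_score <- (mean(ref_vals) - mean(str_vals)) / sd(ref_vals)
print(round(z_score, 3))
# 3.162
```

A larger absolute z-score indicates greater separation between reference and stressed samples relative to the variability among reference samples.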
DE is calculated without knowing the expected direction of response for each metric, so both DE25 and DE75 are reported for a given INDEX_CLASS (or Site Class). DE is the percentage (0-100) of **stressed** samples that fall **below** the **25th** quantile (for decreaser metrics, e.g., total taxa) or **above** the **75th** quantile (for increaser metrics, e.g., HBI) of the **reference** samples.
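As a rough illustration of the DE definition, the base R sketch below computes both values for one hypothetical metric. The vectors are toy values. The strict inequalities and R's default quantile method (type 7) are assumptions and may differ from what the function uses internally.

```r
# Hypothetical toy values for one metric
ref_vals <- c(20, 22, 24, 26, 28)  # reference samples
str_vals <- c(10, 12, 14, 16, 30)  # stressed samples

# 25th and 75th quantiles of the reference samples (default type 7)
q25 <- quantile(ref_vals, 0.25, names = FALSE)
q75 <- quantile(ref_vals, 0.75, names = FALSE)

# Percentage of stressed samples below q25 (decreaser metrics)
DE25 <- 100 * mean(str_vals < q25)
# Percentage of stressed samples above q75 (increaser metrics)
DE75 <- 100 * mean(str_vals > q75)

print(c(DE25 = DE25, DE75 = DE75))
# DE25 = 80, DE75 = 20
```

Here four of the five stressed values fall below the reference 25th quantile (22), so this toy metric would be read as a decreaser with a DE of 80.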
A data frame of the metric.stats input is returned with new columns (z_score, DE25 and DE75). The z-score is added for each Ref_Status. DE25 and DE75 are only added where Ref_Status is labeled as Stressed.
# data, benthos
df_bugs <- data_mmi_dev
# Munge Names
names(df_bugs)[names(df_bugs) %in% "BenSampID"] <- "SAMPLEID"
names(df_bugs)[names(df_bugs) %in% "TaxaID"] <- "TAXAID"
names(df_bugs)[names(df_bugs) %in% "Individuals"] <- "N_TAXA"
names(df_bugs)[names(df_bugs) %in% "Exclude"] <- "EXCLUDE"
names(df_bugs)[names(df_bugs) %in% "Class"] <- "INDEX_CLASS"
names(df_bugs)[names(df_bugs) %in% "Unique_ID"] <- "SITEID"
# Calc Metrics
cols_keep <- c("Ref_v1", "CalVal_Class4", "SITEID", "CollDate", "CollMeth")
# INDEX_NAME and INDEX_CLASS kept by default
df_metval <- metric.values(df_bugs, "bugs", fun.cols2keep = cols_keep)
# Calc Stats
col_metrics <- names(df_metval)[9:ncol(df_metval)]
col_SampID <- "SAMPLEID"
col_RefStatus <- "REF_V1"
RefStatus_Ref <- "Ref"
RefStatus_Str <- "Strs"
RefStatus_Oth <- "Other"
col_DataType <- "CALVAL_CLASS4"
DataType_Cal <- "cal"
DataType_Ver <- "verif"
col_Subset <- "INDEX_CLASS"
Subset_Value <- "CENTRALHILLS"
df_stats <- metric.stats(df_metval
, col_metrics
, col_SampID
, col_RefStatus
, RefStatus_Ref
, RefStatus_Str
, RefStatus_Oth
, col_DataType
, DataType_Cal
, DataType_Ver
, col_Subset
, Subset_Value)
# Calc Stats2 (z-scores and DE)
data_metval <- df_metval
data_metstat <- df_stats
col_metval_RefStatus <- "REF_V1"
col_metval_DataType <- "CALVAL_CLASS4"
col_metval_Subset <- "INDEX_CLASS"
col_metstat_RefStatus <- "REF_V1"
col_metstat_DataType <- "CALVAL_CLASS4"
col_metstat_Subset <- "INDEX_CLASS"
RefStatus_Ref <- "Ref"
RefStatus_Str <- "Strs"
RefStatus_Oth <- "Other"
DataType_Cal <- "cal"
DataType_Ver <- "verif"
Subset_Value <- "CENTRALHILLS"
df_stats2 <- metric.stats2(data_metval
, data_metstat
, col_metval_RefStatus
, col_metval_DataType
, col_metval_Subset
, col_metstat_RefStatus
, col_metstat_DataType
, col_metstat_Subset
, RefStatus_Ref
, RefStatus_Str
, RefStatus_Oth
, DataType_Cal
, DataType_Ver
, Subset_Value)
## Not run:
# Save Results
write.table(df_stats2
, file.path(tempdir(), "metric.stats2.tsv")
, col.names = TRUE
, row.names = FALSE
, sep = "\t")
## End(Not run)