knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5, warning = FALSE, message = FALSE ) library(SQIpro)
Soil quality is defined as "the capacity of a specific kind of soil to function within natural or managed ecosystem boundaries, to sustain plant and animal productivity, maintain or enhance water and air quality, and support human health and habitation" [@doran1994].
The Soil Quality Index (SQI) provides a single numeric score (0–1)
that integrates multiple soil physical, chemical, and biological
indicators. SQIpro implements six established methods for computing SQI,
a flexible variable scoring framework, and publication-quality
visualisation tools — all in a single CRAN-compliant R package.
| Term | Definition | |------|-----------| | Minimum Data Set (MDS) | The smallest subset of soil variables that adequately characterises soil quality [@andrews2004] | | Scoring function | A transformation that converts raw variable values to a 0–1 score | | SQI | Weighted or unweighted combination of variable scores | | Optimum | The value or range of a variable associated with best soil function |
# From CRAN (once published) install.packages("SQIpro") # Development version from GitHub # install.packages("remotes") remotes::install_github("yourname/SQIpro")
SQIpro ships with soil_data, a hypothetical dataset of 100 soil
samples across 5 land-use systems and 2 depths, containing 12 soil
indicators.
data(soil_data) dim(soil_data) head(soil_data)
table(soil_data$LandUse, soil_data$Depth)
Always validate before analysis:
result <- validate_data( soil_data, group_cols = c("LandUse", "Depth") ) result$n_per_group
This is the most important step. Each variable receives a scoring type:
"more" — higher is better (e.g., organic carbon, microbial biomass)"less" — lower is better (e.g., bulk density, electrical conductivity)"opt" — a specific optimum range (e.g., pH 6.0–7.0)"trap" — trapezoidal with explicit zero boundariescfg <- make_config( variable = c("pH", "EC", "BD", "OC", "MBC", "PMN", "Clay", "WHC", "DEH", "AP", "TN"), type = c("opt", "less", "less", "more", "more", "more", "opt", "more", "more", "more","more"), opt_low = c(6.0, NA, NA, NA, NA, NA, 20, NA, NA, NA, NA), opt_high = c(7.0, NA, NA, NA, NA, NA, 35, NA, NA, NA, NA), description = c( "Soil pH (H2O 1:2.5)", "Electrical Conductivity (dS/m)", "Bulk Density (g/cm3)", "Organic Carbon (%)", "Microbial Biomass Carbon (mg/kg)", "Potentially Mineralizable N (mg/kg)", "Clay content (%)", "Water Holding Capacity (%)", "Dehydrogenase Activity (ug TPF/g/day)", "Available Phosphorus (mg/kg)", "Total Nitrogen (%)" ) ) print(cfg)
Before proceeding, always visualise your scoring curves to confirm biological plausibility:
plot_scoring_curves(soil_data, cfg, group_cols = c("LandUse", "Depth"))
scored <- score_all(soil_data, cfg, group_cols = c("LandUse", "Depth")) head(scored[, c("LandUse", "Depth", "pH", "OC", "MBC")])
mds <- select_mds(scored, group_cols = c("LandUse", "Depth"), load_threshold = 0.6) mds$mds_vars
Equal-weight additive index [@doran1994]:
res_lin <- sqi_linear(scored, cfg, group_cols = c("LandUse", "Depth"), mds_vars = mds$mds_vars) print(res_lin)
Variables weighted by stepwise regression coefficients [@masto2008]:
res_reg <- sqi_regression(scored, cfg, dep_var = "OC", group_cols = c("LandUse", "Depth"), mds_vars = mds$mds_vars) print(res_reg)
Variables weighted by variance explained [@andrews2004; @bastida2008]:
res_pca <- sqi_pca(scored, cfg, group_cols = c("LandUse", "Depth"), mds = mds) print(res_pca)
Arithmetic or geometric mean of fuzzy membership scores:
res_fuz <- sqi_fuzzy(scored, cfg, group_cols = c("LandUse", "Depth"), mds_vars = mds$mds_vars, operator = "mean") print(res_fuz)
Objective weights derived from Shannon entropy [@shannon1948]:
res_ent <- sqi_entropy(scored, cfg, group_cols = c("LandUse", "Depth"), mds_vars = mds$mds_vars) print(res_ent) cat("\nEntropy weights:\n") print(attr(res_ent, "entropy_weights"))
Multi-criteria ranking by ideal solution distance [@hwang1981]:
res_top <- sqi_topsis(scored, cfg, group_cols = c("LandUse", "Depth"), mds_vars = mds$mds_vars) print(res_top)
comparison <- sqi_compare(scored, cfg, group_cols = c("LandUse", "Depth"), dep_var = "OC", mds = mds) print(comparison)
plot_scores(scored, cfg, group_cols = c("LandUse", "Depth"), group_by = "LandUse", facet_by = "Depth")
plot_sqi(res_lin, sqi_col = "SQI_linear", group_col = "LandUse", fill_col = "Depth")
# Requires the 'fmsb' package: install.packages("fmsb") plot_radar(scored, cfg, group_col = "LandUse", group_cols = c("LandUse", "Depth"))
plot_pca_biplot(mds, scored, group_col = "LandUse")
# Compute per-observation index scored$SQI_obs <- rowMeans(scored[, mds$mds_vars], na.rm = TRUE) aov_res <- sqi_anova(scored, sqi_col = "SQI_obs", group_col = "LandUse") cat("ANOVA significant:", aov_res$significant, "\n") print(aov_res$compact_letters)
sa <- sqi_sensitivity(scored, cfg, group_cols = c("LandUse", "Depth"), method = "linear", mds_vars = mds$mds_vars) print(sa)
plot_sensitivity(sa)
raw data
│
▼
validate_data() ← check structure, missing values, groups
│
▼
make_config() ← define scoring type per variable
│
▼
plot_scoring_curves() ← verify biological plausibility
│
▼
score_all() ← transform all variables to 0–1
│
▼
select_mds() ← PCA-based minimum data set selection
│
├──► sqi_linear()
├──► sqi_regression()
├──► sqi_pca()
├──► sqi_fuzzy()
├──► sqi_entropy()
└──► sqi_topsis()
│
▼
sqi_compare() ← unified results table + ranking
│
▼
Visualisation ← plot_sqi(), plot_scores(), plot_radar()
│
▼
Statistics ← sqi_anova(), sqi_sensitivity()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.