Nothing
#' Sorting functions for synoptic tables
#'
#' @description
#' This function sorts synoptic tables from \code{\link{syntable}} function output. Sorting criteria
#' can be either numerical values in synoptic tables, such as group-wise frequencies or fidelity
#' measures, as well as combined criteria that also take into account differential character (according to
#' the criteria defined by Tsiripidis et al., 2009).
#'
#' The algorithm aims to sort species to blocked structure considering the defined criteria and input
#' tables, with the best characterizing species on the top of the block, followed by species with
#' descending importance for plant community description.
#'
#' @param syn1 Input synoptic table 1, a data frame with numerical data format, usually from
#' \code{\link{syntable}} function output. See Details for input table format.
#' The values of this table will be displayed in the final output table.
#' @param syn2 Optional second input table with additional numeric or differential character
#' sorting criteria.
#' @param matrix Optional species-sample matrix, already used for \code{\link{syntable}} function input; used only when calculating the
#' sorted species-sample matrix (`samples = TRUE`). Site names are imported from row names.
#' @param groups Group identities for samples (identical to \code{\link{syntable}} function input). A vector/factor of length
#' \code{nrow(matrix)} (one group per sample/row). Ensure matching order of
#' group identity and samples in matrix for correct allocation of group numbers to samples.
#' @param group_order Optional vector giving the desired order of group levels.
#' If provided, only groups listed here are included in the output.
#' @param method Sorting algorithm and synoptic table output options
#' (\code{method = c("allspec", "alldiff", "totalfreq", "manual")}).
#' @param manual_order Optional character vector of species names (matching
#' row names of \code{syn1}) to impose a manual row order when
#' \code{method = "manual"}. Species not listed are appended and ordered by
#' their overall frequency (row sums of \code{syn1}, descending).
#' @param min1 Group-wise threshold minimum value for species shown in the final sorted synoptic table.
#' Species below that minimum will be listed in the output (\code{$others} section).
#' @param min2 Threshold minimum value for considering species values of a numerical second input table \code{syn2}.
#' Species below that minimum will not be displayed in final synoptic table, but will be listed in the
#' output (\code{$others} section).
#' @param samples Logical; if \code{TRUE}, a sorted species-sample matrix is returned. Defaults to \code{FALSE}.
#'
#' @section Details:
#' Different types of sorted synoptic tables can be created with this function. Methods \code{"allspec"} and \code{"alldiff"} are based upon
#' the calculation of group-wise frequencies. They basically build species blocks left-to-right, for one group at a time (\emph{1..k}):
#' For group \emph{i = 1}, all species whose row maximum equals the value in column \emph{i} are taken and sorted in descending order within that block.
#' This is repeated for all subsequent groups, appending species step-by-step. For groups \emph{2..k}, duplicates that were already bound to a previous group are dropped.
#'
#' \itemize{
#' \item \code{method = "allspec"} (\emph{default}): Computes a sorted synoptic table based on one or
#' two numeric input tables, e.g. percentage or absolute frequencies, or phi fidelity values.
#' When specifying a second table with \code{syn2}, the threshold minimum value defined in \code{min2}
#' will be applied for considering species values of \code{syn1} for sorting.
#'
#' \item \code{method = "alldiff"}: Includes differential speciescharacter as sorting criteria.
#' Table \code{syn1} must be numeric (e.g. percentage frequency) and
#' \code{syn2} must contain information on differential character (output from \code{\link{syntable}}
#' function with defined \code{type = "diffspec"}). The result table shows ALL diagnostic and
#' non-diagnostic species, as long as they match the \code{min1} and \code{min2} thresholds.
#' The algorithm detects highest group values of species calculated from
#' \code{syn1} as base for sorting, but will consider differential character criterion
#' from \code{syn2} as well. Species with high values in \code{syn1} AND
#' positive differential character will then be listed on the top of a species block.
#' Within such a block, the differentiating and high-abundant species are sorted in a way favoring
#' species that are positive in only one or at least few groups.
#'
#' \item \code{method = "totalfreq"}: Sorts species by their overall
#' frequency in \code{syn1} (row sums) in descending order. Only species with
#' at least one group value \eqn{\ge} \code{min1} are kept in the table; the
#' rest are listed in \code{$others}.
#'
#' \item \code{method = "manual"}: The rows are ordered by \code{manual_order}
#' (species not present are ignored). Any remaining species are appended,
#' sorted by their overall frequency (row sums of \code{syn1}, descending).
#' Species must still pass \code{min1} (max across groups \eqn{\ge} \code{min1});
#' the rest go to \code{$others}.
#' }
#'
#' @return
#' Returns an (invisible) list composed of:
#' \itemize{
#' \item \code{$output} Sorting method description
#' \item \code{$species} Information to species included in the output table
#' \item \code{$samplesize} Sample sizes in groups
#' \item \code{$syntable} Sorted synoptic table, with the numeric values of \code{syn1} in the left-side columns
#' and differential character of species on the right-side of the output table. See Tsiripidis et al. (2009) for
#' details and criteria for the assignment of a differential species as p = positive, n = negative,
#' pn = positive/negative.
#' \item \code{$others} Species that are omitted in Synoptic table due to their failing
#' reaching the given threshold values \code{min1} and \code{min2}. Sorted alphabetically.
#' \item \code{$samples} Sorted original species-sample matrix, with original Plot-IDs (as column
#' names) and the group identity (Group_No as first row of output samples table) (only when `samples = TRUE`)
#' \item \code{$omitted_groups} Names of groups removed because they were not listed in `group_order`
#' }
#'
#'
#' @references
#' Bruelheide, H. (2000): A new measure of fidelity and its application to defining species groups.
#' \emph{Journal of Vegetation Science} \strong{11}: 167-178. \doi{https://doi.org/10.2307/3236796}
#'
#' Chytry, M., Tichy, L., Holt, J., Botta-Dukat, Z. (2002): Determination of diagnostic species with
#' statistical fidelity measures. \emph{Journal of Vegetation Science} \strong{13}: 79-90. \doi{https://doi.org/10.1111/j.1654-1103.2002.tb02025.x}
#'
#' Sokal, R.R. & Rohlf, F.J. (1995): Biometry. 3rd edition Freemann, New York.
#'
#' Tsiripidis, I., Bergmeier, E., Fotiadis, G. & Dimopoulos, P. (2009): A new algorithm for the
#' determination of differential taxa. \emph{Journal of Vegetation Science} \strong{20}: 233-240. \doi{https://doi.org/10.1111/j.1654-1103.2009.05273.x}
#'
#' @author Jenny Schellenberg, Friedemann von Lampe
#' @seealso \code{\link{syntable}}
#' @examples
#' ### Synoptic table of Scheden vegetation data using syntable()-function:
#' # classification to create a vector of group identity
#' library(cluster)
#' pam1 <- pam(schedenveg, 4)
#'
#'
#' ### One input table for sorting:
#' ## Synoptic table with percentage frequency of species in clusters, all species
#' unordered <- syntable(schedenveg, pam1$clustering, abund = "percentage",
#' type = "percfreq") # Unordered synoptic percentage frequency table
#' sorted <- synsort(syn1 = unordered$syntable, matrix = schedenveg,
#' groups = pam1$clustering, method = "allspec", min1 = 0,
#' samples = TRUE)
#' sorted # view results
#' \dontrun{
#' # Export sorted synoptic table
#' write.csv(sorted$syntab, "syntab.csv")
#' # Export sorted species-sample matrix with original releve data for postprocessing
#' write.csv(sorted$samples, "output_species_sample.csv")}
#'
#' ## Synoptic table with only phi values
#' phi <- syntable(schedenveg, pam1$clustering, abund = "percentage",
#' type = "phi") # calculates cluster-wise phi for each species
#' phi_table <- synsort(syn1 = phi$syntable, matrix = schedenveg, groups = pam1$clustering,
#' method = "allspec", min1 = 0.3, samples = TRUE)
#' phi_table # view results
#'
#' ## Synoptic table with total frequency (global ranking)
#' total <- synsort(syn1 = unordered$syntable,
#' groups = pam1$clustering,
#' method = "totalfreq",
#' min1 = 5)
#' total # view results
#'
#' ### Two numerical tables for sorting:
#' ## Synoptic table showing percentage frequencies, but only for species with minimum phi-value
#' ## of 0.3 AND exclude species with less than 25% percentage frequency
#'
#' unordered <- syntable(schedenveg, pam1$clustering, abund = "percentage",
#' type = "percfreq") # Unordered synoptic percentage frequency table
#' phitable <- syntable(schedenveg, pam1$clustering, abund = "percentage",
#' type = "phi") # calculates cluster-wise phi for each species
#' # now sorting and arranging
#' phi_complete <- synsort(syn1 = unordered$syntable, syn2 = phitable$syntable,
#' matrix = schedenveg, groups = pam1$clustering, method = "allspec",
#' min1 = 25, min2 = 0.3, samples = TRUE)
#' phi_complete # view results
#'
#' ### Differential species analysis
#' differential <- syntable(schedenveg, pam1$clustering, abund = "percentage",
#' type = "diffspec")
#'
#' ## Synoptic table with percentage frequency (only species >25%) and
#' ## differential character.
#' complete <- synsort(syn1 = unordered$syntable, syn2 = differential$syntable,
#' matrix = schedenveg, groups = pam1$clustering,
#' method = "alldiff", min1 = 25, samples = TRUE)
#' complete # view result table
#' differential$differentials # list differential species for groups
#'
#'
#' @export
#' @importFrom Hmisc %nin%
synsort <- function(syn1, syn2 = syn1, matrix = NULL, groups, group_order = NULL,
method = "allspec", min1 = 0, min2 = 0,
samples = FALSE, manual_order = NULL) {
groups <- as.factor(groups)
original_levels <- levels(groups)
if (is.null(group_order)) {
group_order <- original_levels
} else {
if (!all(group_order %in% original_levels))
stop("group_order must contain existing levels from groups")
}
omitted_groups <- setdiff(original_levels, group_order)
#if (length(omitted_groups))
# message("Omitted groups: ", paste(omitted_groups, collapse = ", "))
keep <- groups %in% group_order
groups <- factor(groups[keep], levels = group_order)
if (!is.null(matrix))
matrix <- matrix[keep, , drop = FALSE]
if (!all(group_order %in% colnames(syn1)))
stop("group_order must match column names of syn1")
syn1 <- syn1[, group_order, drop = FALSE]
syn2 <- syn2[, group_order, drop = FALSE]
names(groups) <- row.names(matrix) # Name groups according to site names (if present)
group <- group_order
if (method == "allspec") {
if (all(syn2 == syn1)) {
frames <- list()
all <- syn1[apply(syn1, 1, max) >= min1,]
for (i in 1:length(group)) {
frames[[i]] <- assign(paste0("frame",i), all[apply(all,1,max) == all[,i],])
frames[[i]] <- frames[[i]][sort.list(frames[[i]][,i], decreasing=TRUE),] }
for ( i in 2:length(group)) {
duprows <- rownames(frames[[i]]) %in% rownames(frames[[1]])
frames[[1]] <- rbind(frames[[1]], frames[[i]][!duprows,]) }
allspec <- frames[[1]]
if (samples) {
if (is.null(matrix))
stop("matrix must be provided when samples = TRUE")
specsam <- data.frame(matrix(NA, nrow = nrow(allspec), ncol = nrow(matrix)))
if (length(groups) != nrow(matrix))
stop("groups must have the same length as samples in matrix")
rownames(specsam) <- rownames(allspec)
matrixplotnames <- names(sort(groups))
names(groups) <- seq(1, length(groups), 1)
colnames(specsam) <- names(sort(groups))
for (k in 1:nrow(allspec))
for (i in 1:nrow(specsam)) {{
if(rownames(specsam)[i] == rownames(allspec)[k]) {specsam[i,] <-
matrix[, names(matrix) == rownames(specsam)[i]][as.numeric(names(sort(groups)))]}
else {}
}}
names(specsam) <- matrixplotnames
Group_No <- sort(groups); Group_No <- paste0(Group_No, "L"); specsam <- rbind(Group_No = Group_No, specsam)
specsam[1,] <- substr(specsam[1,],1,1)
} else {
specsam <- data.frame()
}
results <- list("output" = "Synoptic table sorted by numerical values of one input table",
"species" = paste0("species with minimum value =", min1, " in input table 1, others listet below"),
"samplesize" = tapply(rep(1,length(groups)),groups,sum),
"syntable" = allspec,
"others" = if (length(sort(rownames(syn1[apply(syn1,1,max) < min1,]))) == 0)
{"No species excluded from Synoptic table."
} else {sort(rownames(syn1[apply(syn1,1,max) < min1,]))},
"samples" = specsam,
"omitted_groups" = omitted_groups)
} else {
if (is.numeric(unlist(syn2)) == TRUE)
{ all <- syn1[rowSums(syn2) >= min2,]
all <- syn1[apply(syn1,1,max) >= min1,]
} else {stop("check data format for syn2: must be numeric")}
frames <- list()
for ( i in 1:length(group)) {
frames[[i]] <- assign(paste0("frame",i), all[apply(all,1,max) == all[,i],])
frames[[i]] <- frames[[i]][sort.list(frames[[i]][,i], decreasing=TRUE),] }
for ( i in 2:length(group)) {
duprows <- rownames(frames[[i]]) %in% rownames(frames[[1]])
frames[[1]] <- rbind(frames[[1]], frames[[i]][!duprows,]) }
allspec <- frames[[1]]
if (samples) {
specsam <- data.frame(matrix(NA, nrow = nrow(allspec), ncol = nrow(matrix)))
rownames(specsam) <- rownames(allspec)
matrixplotnames <- names(sort(groups))
names(groups) <- seq(1, length(groups), 1)
colnames(specsam) <- names(sort(groups))
for (k in 1:nrow(allspec))
for (i in 1:nrow(specsam)) {{
if(rownames(specsam)[i] == rownames(allspec)[k]) {specsam[i,] <-
matrix[, names(matrix) ==
rownames(specsam)[i]][as.numeric(names(sort(groups)))]
} else {}
}}
names(specsam) <- matrixplotnames
Group_No <- sort(groups); Group_No <- paste0(Group_No, "L"); specsam <- rbind(Group_No = Group_No, specsam)
specsam[1,] <- substr(specsam[1,],1,1)
} else {
specsam <- data.frame()
}
results <- list("output" = "synoptic table sorted by values of two numerical input tables",
"species" = paste0("species with minimum value = ", min1,
" in input table 1 AND with minimum value =", min2,
" in input table 2, others listet below"),
"samplesize" = tapply(rep(1,length(groups)),groups,sum),
"syntable" = allspec,
"others" = if (length(sort(rownames(syn1[apply(syn1,1,max) < min1,]))) == 0)
{"No species excluded from Synoptic table."
} else {sort(rownames(syn1[apply(syn1,1,max) < min1,]))},
"samples" = specsam,
"omitted_groups" = omitted_groups)
}
} else if (method == "totalfreq") {
# keep species that reach min1 in any group (same semantics as "allspec")
allspec <- syn1[apply(syn1, 1, max) >= min1, , drop = FALSE]
# order by overall frequency (row-wise sum across groups), descending
ord <- order(rowSums(allspec, na.rm = TRUE), decreasing = TRUE)
allspec <- allspec[ord, , drop = FALSE]
# optional samples matrix
if (samples) {
if (is.null(matrix))
stop("matrix must be provided when samples = TRUE")
if (length(groups) != nrow(matrix))
stop("groups must have the same length as samples in matrix")
specsam <- data.frame(matrix(NA, nrow = nrow(allspec), ncol = nrow(matrix)))
rownames(specsam) <- rownames(allspec)
matrixplotnames <- names(sort(groups))
names(groups) <- seq(1, length(groups), 1)
colnames(specsam) <- names(sort(groups))
for (k in 1:nrow(allspec))
for (i in 1:nrow(specsam)) {{
if (rownames(specsam)[i] == rownames(allspec)[k]) {
specsam[i, ] <- matrix[, names(matrix) == rownames(specsam)[i]][as.numeric(names(sort(groups)))]
} else {}
}}
names(specsam) <- matrixplotnames
Group_No <- sort(groups); Group_No <- paste0(Group_No, "L")
specsam <- rbind(Group_No = Group_No, specsam)
specsam[1, ] <- substr(specsam[1, ], 1, 1)
} else {
specsam <- data.frame()
}
results <- list(
"output" = "Synoptic table sorted by total (row-wise) frequency of input table 1",
"species" = paste0("species with minimum value = ", min1,
" in at least one group; ordered by row sums of syn1 (descending)"),
"samplesize" = tapply(rep(1, length(groups)), groups, sum),
"syntable" = allspec,
"others" = if (length(sort(rownames(syn1[apply(syn1, 1, max) < min1, , drop = FALSE]))) == 0) {
"No species excluded from Synoptic table."
} else {
sort(rownames(syn1[apply(syn1, 1, max) < min1, , drop = FALSE]))
},
"samples" = specsam,
"omitted_groups" = omitted_groups
)
} else if (method == "manual") {
if (is.null(manual_order))
stop("manual_order must be provided when method = 'manual'")
# keep species that reach min1 in any groups
allspec <- syn1[apply(syn1, 1, max) >= min1, , drop = FALSE]
# sanitize / evaluate the manual list
wanted <- unique(as.character(manual_order))
present_manual <- wanted[wanted %in% rownames(allspec)]
missing_manual <- setdiff(wanted, rownames(allspec))
if (length(missing_manual))
warning("manual_order species not found (or filtered by min1) and will be ignored: ",
paste(missing_manual, collapse = ", "))
# remainder: species not specified manually
remainder <- setdiff(rownames(allspec), present_manual)
if (length(remainder)) {
rem_ord <- order(rowSums(allspec[remainder, , drop = FALSE], na.rm = TRUE),
decreasing = TRUE)
remainder <- remainder[rem_ord]
}
# final row order: manual first (in given order), then remainder by total freq
final_rows <- c(present_manual, remainder)
allspec <- allspec[final_rows, , drop = FALSE]
# optional samples matrix
if (samples) {
if (is.null(matrix))
stop("matrix must be provided when samples = TRUE")
if (length(groups) != nrow(matrix))
stop("groups must have the same length as samples in matrix")
specsam <- data.frame(matrix(NA, nrow = nrow(allspec), ncol = nrow(matrix)))
rownames(specsam) <- rownames(allspec)
matrixplotnames <- names(sort(groups))
names(groups) <- seq(1, length(groups), 1)
colnames(specsam) <- names(sort(groups))
for (k in 1:nrow(allspec))
for (i in 1:nrow(specsam)) {{
if (rownames(specsam)[i] == rownames(allspec)[k]) {
specsam[i, ] <- matrix[, names(matrix) == rownames(specsam)[i]][as.numeric(names(sort(groups)))]
} else {}
}}
names(specsam) <- matrixplotnames
Group_No <- sort(groups); Group_No <- paste0(Group_No, "L")
specsam <- rbind(Group_No = Group_No, specsam)
specsam[1, ] <- substr(specsam[1, ], 1, 1)
} else {
specsam <- data.frame()
}
results <- list(
"output" = "Synoptic table sorted by manual species order (then total frequency)",
"species" = paste0("species with minimum value = ", min1,
"; manual order applied first; remaining species by row sums (descending)"),
"samplesize" = tapply(rep(1, length(groups)), groups, sum),
"syntable" = allspec,
"others" = if (length(sort(rownames(syn1[apply(syn1, 1, max) < min1, , drop = FALSE]))) == 0) {
"No species excluded from Synoptic table."
} else {
sort(rownames(syn1[apply(syn1, 1, max) < min1, , drop = FALSE]))
},
"samples" = specsam,
"omitted_groups" = omitted_groups
)
} else if (method == "alldiff") {
# setup complete table
syntab <- syn2[apply(syn1,1,max) >= min1,]
syntab <- syntab[complete.cases(syntab),]
all <- syn1[apply(syn1,1,max) >= min1,]
all <- all[complete.cases(all),]
completetable <- merge(all, syntab, all.x=TRUE, by= "row.names", sort=F)
rownames(completetable) <- completetable[,1]
completetable <- completetable[,-1]
name <- c("")
for(i in 1:length(group)) {
name[i] = paste0("perc ", sort(unique(groups))[i])
name[i+length(unique(groups))] = paste0("diff ", sort(unique(groups))[i]) }
names(completetable) <- name
frames <- list()
for (i in 1:length(group)) {
frames[[i]] <- assign(paste0("frame",i), completetable[apply(
completetable[,1:length(group)],1,max) == completetable[,i],])
frames[[i]] <- frames[[i]][sort.list(frames[[i]][,i], decreasing=TRUE),] }
for ( i in 2:length(group)) {
duprows <- rownames(frames[[i]]) %in% rownames(frames[[1]])
frames[[1]] <- rbind(frames[[1]], frames[[i]][!duprows,]) }
allspec <- frames[[1]]
if (samples) {
specsam <- data.frame(matrix(NA, nrow = nrow(allspec), ncol = nrow(matrix)))
rownames(specsam) <- rownames(allspec)
matrixplotnames <- names(sort(groups))
names(groups) <- seq(1, length(groups), 1)
colnames(specsam) <- names(sort(groups))
for (k in 1:nrow(allspec))
for (i in 1:nrow(specsam)) {{
if(rownames(specsam)[i] == rownames(allspec)[k]) {specsam[i,] <-
matrix[, names(matrix) == rownames(specsam)[i]][as.numeric(names(sort(groups)))]}
else {}
}}
names(specsam) <- matrixplotnames
Group_No <- sort(groups); Group_No <- paste0(Group_No, "L"); specsam <- rbind(Group_No = Group_No, specsam)
specsam[1,] <- substr(specsam[1,],1,1)
} else {
specsam <- data.frame()
}
results <- list("output" = "complete synoptic table, sorted by values of numeric input table and differential species character",
"species" = paste0("species with minimum value of", sep=" ", min1,
" and their differentiating character"),
"samplesize" = tapply(rep(1,length(groups)),groups,sum),
"syntable" = allspec,
"others" = if (length(sort(rownames(syn1[apply(syn1,1,max) < min1,]))) == 0)
{"No species excluded from Synoptic table."
} else {sort(rownames(syn1[apply(syn1,1,max) < min1,]))},
"samples" = specsam,
"omitted_groups" = omitted_groups)
} else {stop("Sorting of synoptic table failed: wrong method entry. Check correct formula input")}
return(invisible(results))
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.