twinspan: (Modified) TWINSPAN in R
In zdealveindy/twinspanR: TWo-way INdicator SPecies ANalysis (and its modified version) in R

Calculates TWINSPAN (TWo-way INdicator SPecies ANalysis, Hill 1979) and its modified version according to Rolecek et al. (2009).

twinspan(
  com,
  modif = F,
  cut.levels = c(0, 2, 5, 10, 20),
  min.group.size = 5,
  levels = 6,
  clusters = 5,
  diss = "bray",
  min.diss = NULL,
  mean.median = "mean",
  show.output.on.console = FALSE,
  quiet = TRUE,
  ...
)

## S3 method for class 'tw'
summary(object, ...)

`com`	Community data (`data.frame` or `matrix`).
`modif`	Should the modified TWINSPAN algorithm be used? (logical value; default = FALSE, i.e. standard TWINSPAN)
`cut.levels`	Pseudospecies cut levels (default = `c(0,2,5,10,20)`). Should not exceed 9 cut levels.
`min.group.size`	Minimum size of the group, which should not be further divided (default = 5).
`levels`	Number of hierarchical levels of divisions (default = 6, should be between 0 and 15). Applies only for standard TWINSPAN (`modif = FALSE`).
`clusters`	Number of clusters generated by modified TWINSPAN (default = 5). Applies only for modified TWINSPAN (`modif = TRUE`).
`diss`	Dissimilarity (default = 'bray') used to calculate cluster heterogeneity for modified TWINSPAN (`modif = TRUE`). Available options are: `'total.inertia'` for total inertia measured by correspondence analysis; `'whittaker'` for Whittaker's multiplicative measure of beta diversity; `'bray'`, `'jaccard'` and some others (see Details) for pairwise measures of beta diversity; `'multi.jaccard'` and `'multi.sorensen'` for multi-site measures of beta diversity (sensu Baselga et al. 2007). See Details for more information. Applies only for modified TWINSPAN (`modif = TRUE`).
`min.diss`	Minimum dissimilarity under which the cluster will not be divided further (default = NULL, which means that the stopping rule is based on number of clusters (parameter `clusters`)). Currently not implemented.
`mean.median`	Should the average dissimilarity of a cluster be calculated as the mean or the median of all between-sample dissimilarities within the cluster? (default = `'mean'`, alternative is `'median'`)
`show.output.on.console`	Logical; should the function communicating with `twinspan.exe` show the (rather long) output of the TWINSPAN program on the console? Default = `FALSE`. Argument passed via function `shell` into `system`.
`quiet`	Logical; should the function reading the TWINSPAN output files (tw.TWI and tw.PUN) be quiet and not report on the console the number of items it has read? Default = `TRUE`, which means the function is quiet. Argument passed into function `scan`.
`...`	Other (rarely used) TWINSPAN parameters passed into function `create.tw.dat` (see the help file of this function for complete list of modifiable arguments).
`object`	Object of the class `'tw'`.

The function twinspan calculates the TWINSPAN classification algorithm introduced by Hill (1979) and, alternatively, the modified TWINSPAN algorithm introduced by Rolecek et al. (2009). It generates an object of the class tw, with the generic functions print for printing the results, summary for an overview of the parameters, and cut for defining which sample should be classified into which cluster.

Default values for arguments used in twinspan function (e.g. definition of pseudospecies cut levels, number of hierarchical divisions etc.) are the same as the default values of the original TWINSPAN program (Hill 1979) and also WinTWINS (Hill & Smilauer 2005).
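
The role of the pseudospecies cut levels can be illustrated in base R. The sketch below is not the code run by twinspan.exe; it assumes that a species is represented by one pseudospecies for every cut level its abundance reaches, with the 0 level standing for any non-zero abundance. The helper name pseudospecies_count is hypothetical and not part of twinspanR.

```r
# Default cut levels of twinspan()
cut.levels <- c(0, 2, 5, 10, 20)

# Hypothetical helper (not part of twinspanR): number of pseudospecies
# representing each abundance value, assuming one pseudospecies per cut
# level reached and none for an absent species.
pseudospecies_count <- function(abund, cuts) {
  vapply(abund, function(a) {
    if (a > 0) sum(a >= cuts) else 0L
  }, integer(1))
}

abund <- c(0, 1, 2, 7, 20)
counts <- pseudospecies_count(abund, cut.levels)
print(data.frame(abundance = abund, pseudospecies = counts))
```

Under this assumption, an abundance of 7 reaches the cut levels 0, 2 and 5, so the species would enter the analysis as three pseudospecies.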

When calculating modified TWINSPAN (modif = TRUE), one needs to choose the number of target clusters (argument clusters) instead of the number of hierarchical levels of divisions used in standard TWINSPAN (argument levels), and also the measure of dissimilarity (diss) used to calculate the heterogeneity of clusters at each step of division (the most heterogeneous cluster is divided in the next step). Several groups of beta diversity measures are currently implemented:

'total.inertia' - total inertia, calculated by correspondence analysis (cca function from vegan) and applied on original quantitative species data (abundances);
'whittaker' - Whittaker's beta diversity, calculated as gamma/mean(alpha)-1 (Whittaker 1960), applied on species data transformed into presences-absences;
'manhattan', 'euclidean', 'canberra', 'bray', 'kulczynski', 'jaccard', 'gower', 'altGower', 'morisita', 'horn', 'mountford', 'raup', 'binomial', 'chao', 'cao' or 'mahalanobis' - mean of beta diversities calculated among pairs of samples - the value of diss is passed into the argument method of the vegdist function of vegan, applied on original quantitative species data (abundances);
'multi.sorensen' or 'multi.jaccard' - multi-site beta diversity, calculated from group of sites according to Baselga et al. (2007) and using function beta.multi from library betapart.
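
How these measures rank clusters can be sketched in base R on a toy dataset. This is an illustration only, not the package's internal code: the helpers whittaker_beta and pairwise_bray and the toy matrix are hypothetical, Bray-Curtis dissimilarity is written out by hand instead of calling vegdist, and the cluster assignment is given rather than produced by TWINSPAN.

```r
# Toy community matrix: rows are samples, columns are species abundances
com <- rbind(
  s1 = c(1, 0, 2),
  s2 = c(0, 1, 2),
  s3 = c(1, 1, 0),
  s4 = c(2, 1, 0),
  s5 = c(3, 1, 0)
)
cluster <- c("A", "A", "A", "B", "B")  # assumed existing clusters

# Whittaker's beta diversity: gamma / mean(alpha) - 1 on presence-absence data
whittaker_beta <- function(x) {
  pa <- x > 0
  gamma <- sum(colSums(pa) > 0)   # species present in at least one sample
  alpha <- rowSums(pa)            # species richness of each sample
  gamma / mean(alpha) - 1
}

# Mean (or median) of pairwise Bray-Curtis dissimilarities within a cluster
pairwise_bray <- function(x, summary_fun = mean) {
  pairs <- combn(nrow(x), 2)
  d <- apply(pairs, 2, function(p) {
    sum(abs(x[p[1], ] - x[p[2], ])) / sum(x[p[1], ] + x[p[2], ])
  })
  summary_fun(d)
}

groups <- split(seq_len(nrow(com)), cluster)
het_mean   <- sapply(groups, function(i) pairwise_bray(com[i, , drop = FALSE], mean))
het_median <- sapply(groups, function(i) pairwise_bray(com[i, , drop = FALSE], median))
het_whit   <- sapply(groups, function(i) whittaker_beta(com[i, , drop = FALSE]))

# The most heterogeneous cluster is the candidate for the next division
next_to_divide <- names(which.max(het_mean))
print(rbind(het_mean, het_median, het_whit))
print(next_to_divide)
```

The mean and the median can differ noticeably in small clusters, which is one reason the choice is exposed through the mean.median argument.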

If the row names in community matrix (com) contain spaces, these names will be transformed into syntactically valid names using function make.names (syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number).
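
The effect of make.names on row names can be checked before running the analysis. The sketch below uses only base R; the example names are made up. Whether twinspan calls make.names with unique = TRUE is not stated here, so the last line is only a reminder that distinct names may collapse into the same valid name.

```r
# Hypothetical row names of a community matrix
plot.names <- c("plot 1", "wet meadow", "1a", "ok_name")

# Spaces become dots; a name starting with a digit gets the prefix "X"
valid.names <- make.names(plot.names)
print(valid.names)

# Two different names can map to the same valid name ...
collapsed <- make.names(c("a b", "a.b"))
# ... unless uniqueness is enforced explicitly
unique.names <- make.names(c("a b", "a.b"), unique = TRUE)
print(collapsed)
print(unique.names)
```

Renaming the rows of com yourself before calling twinspan keeps the mapping between original and transformed names under your control.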

Arguments show.output.on.console and quiet regulate how verbose the twinspan function is while running. The default setting (show.output.on.console = FALSE, quiet = TRUE) suppresses all output. Setting quiet = FALSE has only a minor effect - it reports how many items have been read in each step of the analysis from the output files (tw.TWI and tw.PUN) using function scan (the argument quiet is passed directly into this function). In contrast, setting show.output.on.console = TRUE prints the complete output generated by the twinspan.exe program on the console. Argument show.output.on.console behaves similarly to the argument of the same name in function system, but its value is not passed directly to that function. Output can be captured using function capture.output from package utils - see examples below.

twinspan returns object of the class 'tw', which is a list with the following items:

classif data frame with three columns: order - sequential number of plot, plot.no - original number of plot (row.names in community matrix com), class - binomial code with hierarchical result of TWINSPAN classification.
twi vector (if modif = FALSE) or list (if modif = TRUE) of complete machine-readable output of TWINSPAN algorithm read from *.TWI file. In case of modified TWINSPAN (modif = TRUE) it is a list with the number of items equal to the number of clusters.
spnames data frame with two columns: full.name - full scientific species name (names (com)), abbrev.name - eight-character abbreviation created by make.cepnames function from vegan.
modif logical; was the result calculated using standard TWINSPAN (modif = FALSE) or its modified version (modif = TRUE)?

Mark O. Hill wrote the original Fortran code, which has been compiled by Stephan M. Hennekens into twinspan.exe to be used within his application MEGATAB (which was, in the past, part of Turboveg for Windows; Hennekens & Schaminee 2001). This version of twinspan.exe was later used also in JUICE program (Tichy 2002) and fixed by Petr Smilauer for issues related to order instability. The twinspanR package was written by David Zeleny (zeleny.david@gmail.com); it is basically an R wrapper around twinspan.exe program maintaining the communication between twinspan.exe and R, with some added functionality (e.g. implementing the algorithm of modified TWINSPAN by Rolecek et al. 2009).

Baselga A., Jimenez-Valverde A. & Niccolini G. (2007): A multiple-site similarity measure independent of richness. Biology Letters, 3: 642-645.
Hennekens S.M. & Schaminee J.H.J. (2001): TURBOVEG, a comprehensive data base management system for vegetation data. Journal of Vegetation Science, 12: 589-591.
Hill M.O. (1979): TWINSPAN - A FORTRAN program for arranging multivariate data in an ordered two-way table by classification of the individuals and attributes. Section of Ecology and Systematics, Cornell University, Ithaca, New York.
Hill M.O. & Smilauer P. (2005): TWINSPAN for Windows version 2.3. Centre for Ecology and Hydrology & University of South Bohemia, Huntingdon & Ceske Budejovice.
Rolecek J., Tichy L., Zeleny D. & Chytry M. (2009): Modified TWINSPAN classification in which the hierarchy respects cluster heterogeneity. Journal of Vegetation Science, 20: 596-602.
Tichy L. (2002): JUICE, software for vegetation classification. Journal of Vegetation Science, 13: 451-453.
Whittaker R.H. (1960): Vegetation of the Siskiyou mountains, Oregon and California. Ecological Monographs, 30: 279-338.

create.tw.dat, cut.tw, print.tw.

## Modified TWINSPAN on traditional Ellenberg's Danube meadow dataset, projected on DCA 
## and compared with original classification into three vegetation types made by tabular sorting:
library (twinspanR)
library (vegan)
data (danube)
res <- twinspan (danube$spe, modif = TRUE, clusters = 4)
k <- cut (res)
dca <- decorana (danube$spe)
par (mfrow = c(1,2))
ordiplot (dca, type = 'n', display = 'si', main = 'Modified TWINSPAN')
points (dca, col = k)
for (i in c(1,2,4)) ordihull (dca, groups = k, show.group = i, col = i,
 draw = 'polygon', label = TRUE)
ordiplot (dca, type = 'n', display = 'si', main = 'Original assignment\n (Ellenberg 1954)')
points (dca, col = danube$env$veg.type)
for (i in c(1:3)) ordihull (dca, groups = danube$env$veg.type, 
 show.group = unique (danube$env$veg.type)[i], col = i,
 draw = 'polygon', label = TRUE)

## To capture the console output of twinspan.exe into R object, use the following:
## Not run: 
out <- capture.output (tw <- twinspan (danube$spe, show.output.on.console = TRUE))
summary (tw)           # returns summary of twinspan algorithm
cat (out, sep = '\n')  # prints the captured output
write.table (out, file = 'out.txt', quote = FALSE, row.names = FALSE) # writes output to 'out.txt' file

## End(Not run)