classification: Classify samples by taxonomic composition
In walterxie/ComMA: Community Matrix Analysis


Build a classification model to classify samples or predict their type (e.g. land type).

ctreeClassification(comm, attr.data, group.id, levels = c(), ...)

lassoClassification(relative.abund, attr.data, group.id, percent = 0.01,
  alpha = 1, family = "multinomial", nlambda = 500,
  coef.s = "lambda.min", return.df = TRUE, ...)

`comm`	A community matrix containing either abundances or relative abundances.
`group.id`	The name of the column in `attr.data` that contains the known groups to compare with enterotypes.
`levels`	Levels to order data by `group.id`.
`relative.abund, attr.data, percent`	Refer to `enterotype`.
`alpha, family, nlambda, ...`	Refer to `cv.glmnet`.
`coef.s`	Which coefficients to return from `coef`. `lambda.min`, the default, is the value of lambda that gives the minimum mean cross-validated error. `lambda.1se` gives the most regularized model whose error is within one standard error of the minimum.

`ctreeClassification` creates a conditional inference tree (`ctree`) for classifying samples. Rather than selecting the variable that maximizes an information measure (e.g. Gini coefficient), `ctree` selects split variables with a significance test procedure. See http://stats.stackexchange.com/questions/12140/conditional-inference-trees-vs-traditional-decision-trees.
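To make the idea concrete, here is a minimal sketch of fitting a conditional inference tree directly. It is not the ComMA implementation: it assumes the `partykit` package is installed, and it uses the built-in `iris` data as a stand-in, with the numeric columns playing the role of taxa abundances and `Species` playing the role of the `group.id` column.

```r
# Sketch only: partykit::ctree on built-in data, not ctreeClassification itself.
library(partykit)

# Rows are samples; the four numeric columns stand in for taxa abundances
# and Species stands in for the known group (the group.id column).
samples <- iris

# Split variables are chosen by significance tests rather than by
# maximizing an impurity-based measure.
fit <- ctree(Species ~ ., data = samples)

# Predicted group for every sample, cross-tabulated against the known group.
pred <- predict(fit, newdata = samples)
confusion <- table(observed = samples$Species, predicted = pred)
print(confusion)
```

Calling `plot(fit)` on the fitted object draws the tree, in the same way as `plot(model$ctree)` in the examples.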

`lassoClassification` builds a classification model with lasso-penalized `cv.glmnet`, given the relative abundance of taxonomic groups (e.g. families) in the samples, and identifies which taxonomic groups are important for the classification.
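The following sketch shows the underlying `cv.glmnet` call and the effect of the two `coef.s` choices. It is an illustration under assumptions, not the body of `lassoClassification`: it assumes the `glmnet` package is installed, uses the built-in `iris` data in place of a relative abundance matrix, and lowers `nlambda` to 100 to keep the run short.

```r
# Sketch only: a direct cv.glmnet call, not lassoClassification itself.
library(glmnet)

# x: samples by features (stand-in for relative abundances of taxa);
# y: known group of each sample (stand-in for the group.id column).
x <- as.matrix(iris[, 1:4])
y <- iris$Species

# Cross-validation folds are random, so fix the seed for repeatability.
set.seed(1)

# alpha = 1 selects the lasso penalty; "multinomial" handles more than two groups.
cvfit <- cv.glmnet(x, y, family = "multinomial", alpha = 1, nlambda = 100)

# For a multinomial fit, coef() returns one sparse coefficient matrix per group.
coef.min <- coef(cvfit, s = "lambda.min")  # minimum mean cross-validated error
coef.1se <- coef(cvfit, s = "lambda.1se")  # more regularized, within one SE

# Features with a non-zero coefficient for the first group at lambda.min.
m <- as.matrix(coef.min[[1]])
selected <- rownames(m)[m[, 1] != 0]
print(selected)
```

Features whose coefficients are non-zero for at least one group are the ones the lasso retained; `lambda.1se` typically keeps fewer of them than `lambda.min`.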

model <- ctreeClassification(comm, attr.data=env, group.id="land.use")
plot(model$ctree)

cvfit <- lassoClassification(relative.abund, attr.data=env, group.id="land.use")