extract_var_imp | R Documentation |
Inside enrich_annotation_file(), the feature relevance for the
classification model is estimated from the Document Term Matrix (DTM) and
stored in the Annotation file. In the case of the default BART model, the
feature importance is the rate of posterior trees in which a term was used,
plus its Z score if an ensemble of models is used.
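As a rough illustration of the inclusion rate described above, the sketch below simulates which terms are used by the posterior trees of a small ensemble. It is not the package's implementation: the simulated data, the object names, and in particular the definition of the Z score (mean rate across models divided by its standard deviation) are assumptions made only for this example.

```r
# Hypothetical toy setup: 5 models, each with 100 posterior trees, 3 terms.
set.seed(1)
terms <- c("term_a", "term_b", "term_c")
n_models <- 5
n_trees <- 100
use_prob <- c(0.6, 0.2, 0.05) # assumed probability that a tree uses each term

# For each model, compute the share of trees in which each term was used.
rates <- t(sapply(seq_len(n_models), function(m) {
  used <- sapply(use_prob, function(p) rbinom(n_trees, 1, p)) # trees x terms
  colMeans(used)
}))
colnames(rates) <- terms

# Summarise across the ensemble. The Z score definition here is an assumption.
importance <- data.frame(
  term = terms,
  rate = colMeans(rates),
  z = colMeans(rates) / apply(rates, 2, sd),
  row.names = NULL
)
print(importance)
```

Terms with a high rate that is also stable across models get a high Z score under this assumed definition, which is the kind of quantity score_filter would act on.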
extract_var_imp(
  session_name,
  num_vars = 15,
  score_filter = 1.5,
  recompute_DTM = FALSE,
  sessions_folder = getOption("baysren.sessions_folder", "Sessions")
)
session_name |
The name of a session. |
num_vars |
The number of best features to report, according to model importance. |
score_filter |
Threshold on the model-related Z score, used to filter out less relevant features. |
recompute_DTM |
Whether to recompute the DTM. |
sessions_folder |
The folder in which all sessions are stored. |
In addition to the model-derived scores, a Poisson regression is used to estimate the association of a term with relevant records, reported as a log-linear regression coefficient and its Z score. This helps to distinguish terms that are relevant by themselves (both the model-related and the linear Z scores are high) from terms that are relevant only in association with other terms (only the model Z score is high).
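The linear association can be sketched with base R's glm() and a Poisson family. This is a minimal example on simulated data, not the package's code: the toy_dtm data frame, its column names, and the choice to fit all terms jointly are assumptions for illustration.

```r
# Toy data: 200 records with a binary relevance label and two term indicators.
set.seed(42)
n <- 200
relevant <- rbinom(n, 1, 0.3)
# term_a appears more often in relevant records; term_b is unrelated to relevance.
term_a <- rbinom(n, 1, ifelse(relevant == 1, 0.8, 0.2))
term_b <- rbinom(n, 1, 0.5)
toy_dtm <- data.frame(relevant, term_a, term_b)

# Log-linear (Poisson) regression of relevance on term presence.
fit <- glm(relevant ~ term_a + term_b, data = toy_dtm, family = poisson())
coefs <- summary(fit)$coefficients

# Keep the coefficient and Z score of each term, dropping the intercept.
linear_scores <- data.frame(
  term = rownames(coefs)[-1],
  estimate = coefs[-1, "Estimate"],
  z = coefs[-1, "z value"],
  row.names = NULL
)
print(linear_scores)
```

In this toy setup term_a gets a large positive coefficient and Z score, while term_b stays near zero. A term with a high model Z score but a low linear Z score would suggest relevance that depends on other terms.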
A data frame with the features (and the part of the record they are related to), the model importance score and its Z score (if an ensemble of models is used), the log-linear association according to the Poisson model and the linear Z score.
## Not run: 
extract_var_imp("Session1")

## End(Not run)