View source: R/create_dt_matrices.R
Creates sparse document-term matrices from labelled and unlabelled data, ready for use with the xgboost algorithm.
create_dt_matrices(
  labelled_data,
  unlabelled_data,
  text_vars,
  topics,
  max_sparsity = 0.999,
  val_split = 0.2
)
labelled_data | Pre-processed binary labelled data frame.
unlabelled_data | Pre-processed unlabelled data frame.
text_vars | List of text variables to include in the analysis.
topics | List of topics to include in the analysis.
max_sparsity | The maximum sparsity allowed in the document-term matrix. Default: 0.999
val_split | The proportion of the training data to hold out as the validation set. Default: 0.2
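The roles of max_sparsity and val_split can be illustrated with a small base R sketch. This is not the package's implementation: the toy matrix, the rule of keeping terms whose sparsity is at or below the threshold, and the random row split are all assumptions made for illustration, and a dense matrix stands in for the sparse matrices the function returns.

```r
# Toy document-term matrix (made-up data): rows are documents, columns are terms.
dtm <- matrix(
  c(1, 0, 1,
    1, 0, 0,
    0, 0, 2,
    3, 0, 0,
    1, 1, 0),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("doc", 1:5), c("tax", "rare", "health"))
)

# Sparsity of a term: the share of documents in which it does not appear.
term_sparsity <- colMeans(dtm == 0)

# Assumed rule: keep terms whose sparsity does not exceed the threshold.
# A low threshold is used here so that the toy example drops a term.
max_sparsity <- 0.7
dtm_reduced <- dtm[, term_sparsity <= max_sparsity, drop = FALSE]

# Assumed rule: hold out a random share of rows as the validation set.
val_split <- 0.2
set.seed(42)
n_val <- round(val_split * nrow(dtm_reduced))
val_idx <- sample(nrow(dtm_reduced), n_val)
train_dtm <- dtm_reduced[-val_idx, , drop = FALSE]
val_dtm <- dtm_reduced[val_idx, , drop = FALSE]
```

With the default max_sparsity of 0.999, only terms that are absent from nearly every document would be dropped under this rule; with the default val_split of 0.2, roughly one in five labelled rows would go to the validation set.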
A complete labelled document-term matrix with corresponding labels, a labelled document-term matrix split into training and validation sets with corresponding labels, and an unlabelled document-term matrix used for predictions.