View source: R/2_1_textTrain.R
textTrainLists | R Documentation |
Individually trains word embeddings from several text variables to several numeric or categorical variables.
textTrainLists(
  x,
  y,
  force_train_method = "automatic",
  save_output = "all",
  method_cor = "pearson",
  eval_measure = "rmse",
  p_adjust_method = "holm",
  ...
)
x |
Word embeddings from textEmbed (or textEmbedLayerAggreation). You can pair word embeddings from one text variable with several numeric/categorical variables or, vice versa, word embeddings from several text variables with one numeric/categorical variable. Numeric and categorical variables cannot be mixed. |
y |
Tibble with several numeric or categorical variables to predict. Please note that you cannot mix numeric and categorical variables. |
force_train_method |
(character) Default is "automatic"; see also "regression" and "random_forest". |
save_output |
(character) Option not to save all output; default "all". See also "only_results" and "only_results_predictions". |
method_cor |
(character) A character string describing the type of correlation (default "pearson"). |
eval_measure |
(character) Type of evaluative measure to assess models on (default "rmse"). |
p_adjust_method |
Method to adjust/correct p-values for multiple comparisons (default = "holm"; see also "none", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr"). |
... |
Arguments from textTrainRegression or textTrainRandomForest (the textTrain function). |
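The option names listed for p_adjust_method match the methods accepted by base R's stats::p.adjust. Assuming the argument is forwarded to that function (this page does not confirm it), the following sketch shows how a few of the methods change a set of made-up p-values:

```r
# Hypothetical p-values, e.g., one per text-variable/outcome pair.
p_values <- c(0.010, 0.020, 0.030, 0.040)

# Apply several correction methods with base R's p.adjust().
# Each column of the result holds the adjusted p-values for one method.
methods <- c("none", "holm", "bonferroni", "BH")
adjusted <- sapply(methods, function(m) p.adjust(p_values, method = m))

print(adjusted)
```

With "none" the values are returned unchanged; the other methods never make a p-value smaller than its unadjusted value.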
Correlations between predicted and observed values (t-value, degrees of freedom (df), p-value, confidence interval, alternative hypothesis, correlation coefficient) stored in a dataframe.
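The listed quantities are the same ones that base R's stats::cor.test reports. As a rough illustration only, using simulated data and hypothetical column names rather than the package's actual output format, one row of such a dataframe could be assembled like this:

```r
# Simulated observed and predicted values (illustrative data only).
set.seed(1)
observed <- rnorm(40)
predicted <- 0.6 * observed + rnorm(40, sd = 0.8)

# Pearson correlation test between predicted and observed values.
ct <- cor.test(predicted, observed, method = "pearson")

# Collect the test output in a one-row dataframe.
# The column names are assumptions, not the names used by textTrainLists.
results_row <- data.frame(
  t_value = unname(ct$statistic),
  df = unname(ct$parameter),
  p_value = ct$p.value,
  conf_low = ct$conf.int[1],
  conf_high = ct$conf.int[2],
  alternative = ct$alternative,
  correlation = unname(ct$estimate)
)

print(results_row)
```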
See textTrain, textTrainRegression and textTrainRandomForest.
# Examines how well the embeddings from Language_based_assessment_data_8 can
# predict the numerical variables in Language_based_assessment_data_8.
# Training is done combination-wise, i.e., correlations are tested pairwise
# (columns 1-5, 1-6, 2-5, 2-6), resulting in a dataframe with four rows.
## Not run:
word_embeddings <- word_embeddings_4$texts[1:2]
ratings_data <- Language_based_assessment_data_8[5:6]
trained_model <- textTrainLists(
  x = word_embeddings,
  y = ratings_data
)
# Examine results (t-value, degrees of freedom (df), p-value,
# alternative hypothesis, confidence interval, correlation coefficient).
trained_model$results
## End(Not run)