dfm_scaling_test: Comparison of dfms using N-dimensional scaling, with a test...
In preText: Diagnostics to Assess the Effects of Text Preprocessing Decisions


Scale each dfm into an N-dimensional space and test for outliers.

dfm_scaling_test(scaling_results, labels, dimensions = 2,
  distance_method = "cosine", method = c("distances", "positions"),
  return_positions = FALSE)

`scaling_results`	A list object produced by the 'scaling_comparison()' function.
`labels`	A character vector with labels for each dfm. This can be extracted from the '$labels' field of the output from the 'factorial_preprocessing()' function.
`dimensions`	The number of dimensions to be used by the multidimensional scaling algorithm. Defaults to 2.
`distance_method`	The method that should be used for calculating distances between dfms. Defaults to "cosine".
`method`	Whether the raw distances or the scaled document positions should be used for scaling. One of "distances" or "positions"; defaults to "distances".
`return_positions`	Logical indicating whether dfm positions should be returned as a data.frame. Defaults to FALSE.
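
To make the `dimensions` and `distance_method` arguments more concrete, here is a minimal base-R sketch of what "cosine distance followed by multidimensional scaling" can look like. It is an illustration only and not the preText implementation: the matrix `x`, its row names, and the helper `cosine_distance()` are hypothetical, the numbers are made up, and the outlier test itself is not reproduced here.

```r
# Hypothetical toy data: each row stands in for one preprocessing
# specification, each column for an arbitrary numeric feature.
x <- rbind(
  spec_a = c(1, 0, 2, 1),
  spec_b = c(1, 1, 2, 0),
  spec_c = c(0, 3, 0, 1),
  spec_d = c(2, 1, 0, 3),
  spec_e = c(0, 1, 1, 4)
)

# Hypothetical helper: cosine distance = 1 - cosine similarity between rows.
cosine_distance <- function(m) {
  norms <- sqrt(rowSums(m^2))
  sim <- (m %*% t(m)) / (norms %o% norms)
  # Clamp tiny negative values caused by floating-point rounding.
  as.dist(pmax(1 - sim, 0))
}

d <- cosine_distance(x)

# Classical multidimensional scaling into 2 dimensions, analogous in
# spirit to dimensions = 2 (stats::cmdscale is part of base R).
positions <- cmdscale(d, k = 2)

print(round(positions, 3))
```

Each row of `positions` is the location of one specification in the 2-dimensional space; specifications that land far from the rest are the kind of points an outlier test would be expected to flag.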

A result list object, or a plot, or both.

## Not run: 
# *** This function is used automatically inside of the preText() function.
# load the package
library(preText)
# load in the data
data("UK_Manifestos")
# preprocess data
preprocessed_documents <- factorial_preprocessing(
    UK_Manifestos,
    use_ngrams = TRUE,
    infrequent_term_threshold = 0.02,
    verbose = TRUE)
# scale documents
scaling_results <- scaling_comparison(preprocessed_documents$dfm_list,
                                      dimensions = 2,
                                      distance_method = "cosine",
                                      verbose = TRUE)
# now perform the scaling test
dfm_scaling_test(scaling_results,
                 labels = preprocessed_documents$labels)

## End(Not run)