monkey_classify: Monkeylearn classify from a dataframe column or vector of...

View source: R/monkey_classify.R

monkey_classify    R Documentation

Monkeylearn classify from a dataframe column or vector of texts


Classifies each row of a dataframe independently, using the Monkeylearn classifier modules


monkey_classify(input, col = NULL, key = monkeylearn_key(quiet = TRUE),
  classifier_id = "cl_oFKL5wft", params = NULL, texts_per_req = NULL,
  unnest = TRUE, .keep_all = TRUE, verbose = TRUE, ...)



input: A dataframe or vector of texts (each text smaller than 50kB)


col: If input is a dataframe, the unquoted name of the character column containing the text to classify


key: The API key


classifier_id: The ID of the classifier


params: Parameters for the module, as a named list


texts_per_req: Number of texts to be processed per request. Minimum value is the number of texts in input; maximum is 200, as per the Monkeylearn documentation. If NULL, we default to 200, or, if there are fewer than 200 texts, the length of the input.


unnest: Should the output column be unnested?


.keep_all: If input is a dataframe, should the columns other than col be retained in the output?


verbose: Whether to output messages about batch requests and progress of processing


...: Other arguments


Find IDs of classifiers using

This function relates the rows in your original dataframe or elements in your vector to a classification particular to that row. This allows you to know which row of your original dataframe is associated with which classification. Each row of the dataframe is classified separately from all of the others, but the number of classifications a particular input row is assigned may vary (unless you specify a fixed number of outputs in params).

The texts_per_req parameter only sets how many rows are sent to the API per request; the texts are not lumped together and classified as a group. Varying this parameter does not affect the final output, but it does affect speed: one batched request of x texts is faster than x single-text requests. Even when batched, each text still counts as one query, so batching does not reduce the number of queries charged against your API quota. See the Monkeylearn API docs for more details.
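The batching behaviour described above can be sketched in base R. This is only an illustration under stated assumptions: split_into_batches() is a hypothetical helper, not a function exported by the package, and the package's internal batching may well differ. It shows that texts_per_req changes the number of requests while the number of queries (one per text) stays the same.

```r
# Hypothetical helper for illustration only; not part of the monkeylearn package.
split_into_batches <- function(texts, texts_per_req = NULL) {
  if (is.null(texts_per_req)) {
    # Documented default: 200, or the input length when there are fewer texts.
    texts_per_req <- min(200L, length(texts))
  }
  # Give each text a batch index, then split while preserving input order.
  batch_index <- ceiling(seq_along(texts) / texts_per_req)
  unname(split(texts, batch_index))
}

texts <- paste("text", 1:5)
batches <- split_into_batches(texts, texts_per_req = 2)

length(batches)        # 3 requests: batches of 2, 2 and 1 texts
sum(lengths(batches))  # 5 queries: still one per text
```

With texts_per_req = NULL the same five texts would go out in a single request, again counting as five queries.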

You can check how many API calls you have left with attr(output, "headers")$x.query.limit.remaining, and your overall query limit with attr(output, "headers")$x.query.limit.limit.
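As a rough sketch of how the headers attribute might be read, the snippet below builds a mock object in place of a real monkey_classify() result, so it runs without an API key. The header values are made up, and queries_left() is a hypothetical helper. Whether the real headers are stored as character or numeric is an assumption either way, so the helper converts explicitly.

```r
# Mock object standing in for a real monkey_classify() result.
# The header values below are invented for illustration.
output <- data.frame(txt = c("first text", "second text"),
                     stringsAsFactors = FALSE)
attr(output, "headers") <- data.frame(
  x.query.limit.limit = "1000",
  x.query.limit.remaining = "998",
  stringsAsFactors = FALSE
)

# Hypothetical helper: how many queries are left, as a number.
queries_left <- function(result) {
  headers <- attr(result, "headers")
  # With several batches there may be one row per request;
  # take the smallest value, i.e. the most recent state of the quota.
  min(as.numeric(headers$x.query.limit.remaining))
}

remaining <- queries_left(output)
limit <- as.numeric(attr(output, "headers")$x.query.limit.limit[1])

if (remaining < 0.1 * limit) {
  message("Less than 10% of the query quota is left.")
}
```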


A data.frame (tibble) with the cleaned input (empty strings removed) and a new column, nested by default, containing the classification for that particular row. The result also carries a "headers" attribute, itself a data.frame (tibble), which includes the number of remaining queries as "x.query.limit.remaining".


## Not run: 
text1 <- "Hauràs de dirigir-te al punt de trobada del grup al que et vulguis unir."
text2 <- "i want to buy an iphone"
text3 <- "Je déteste ne plus avoir de dentifrice."
text4 <- "I hate not having any toothpaste."
request_df <- tibble::as_tibble(list(txt = c(text1, text2, text3, text4)))
# Classify the txt column, sending two texts per request
output <- monkey_classify(request_df, txt, texts_per_req = 2, unnest = TRUE)
# Inspect the API headers, including the remaining query quota
attr(output, "headers")
## End(Not run)

masalmon/monkeylearn documentation built on May 17, 2022, 4:42 a.m.