weblmCalculateConditionalProbability: Calculates the conditional probability that a word follows a...

View source: R/weblmCalculateConditionalProbability.R

Description

This function calculates the conditional probability that a particular word will follow a given sequence of words. The input string must be in ASCII format.

Internally, this function invokes the Microsoft Cognitive Services Web Language Model REST API documented at https://www.microsoft.com/cognitive-services/en-us/web-language-model-api/documentation.

You MUST have a valid Microsoft Cognitive Services account and an API key for this function to work properly. See https://www.microsoft.com/cognitive-services/en-us/pricing for details.

Usage

weblmCalculateConditionalProbability(precedingWords, continuations,
  modelToUse = "body", orderOfNgram = 5L)

Arguments

precedingWords

(character) The sequence of words, given as a single character string, for which to calculate continuation probabilities. Must be in ASCII format.

continuations

(character vector) Vector of words following precedingWords for which to calculate conditional probabilities.

modelToUse

(character) Which language model to use, supported values: "title", "anchor", "query", or "body" (optional, default: "body")

orderOfNgram

(integer) Which order of N-gram to use, supported values: 1L, 2L, 3L, 4L, or 5L (optional, default: 5L)
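
The argument constraints above can be checked locally before a request is sent. The following is a minimal sketch only: `isAsciiOnly` and `checkWeblmArgs` are hypothetical helpers written for illustration and are not part of mscsweblm4r, which may validate its inputs differently.

```r
# Hypothetical helpers for illustration; they are not part of mscsweblm4r.
isAsciiOnly <- function(x) {
  # TRUE for each string whose code points all fall in the 7-bit ASCII range
  vapply(
    x,
    function(s) all(utf8ToInt(enc2utf8(s)) <= 127L),
    logical(1),
    USE.NAMES = FALSE
  )
}

checkWeblmArgs <- function(precedingWords, continuations,
                           modelToUse = "body", orderOfNgram = 5L) {
  # Stops with an error if any documented constraint is violated
  stopifnot(
    is.character(precedingWords), length(precedingWords) == 1L,
    is.character(continuations), length(continuations) >= 1L,
    all(isAsciiOnly(c(precedingWords, continuations))),
    modelToUse %in% c("title", "anchor", "query", "body"),
    orderOfNgram %in% 1:5
  )
  invisible(TRUE)
}

# Passes silently for the values used in the Examples section
checkWeblmArgs("hello world wide", c("web", "range", "open"), "title", 4L)
```

Running a check like this first avoids spending an API call on a request that the documented constraints already rule out.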

Value

An S3 object of class weblm. The results are stored in the results data frame inside this object, with one row per continuation: the preceding words (words), the continuation word (word), and its log-probability (probability).
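
Because the logarithm is monotonic, continuations can be ranked directly on the probability column without converting back to raw probabilities. The sketch below uses a mock data frame built from the values shown in the Examples section instead of a live API call; the column names are taken from that printed output.

```r
# Mock of the `results` data frame, using the values from the Examples section
results <- data.frame(
  words = rep("hello world wide", 3),
  word = c("web", "range", "open"),
  probability = c(-0.32, -2.403, -2.97),
  stringsAsFactors = FALSE
)

# Sort continuations from most to least likely (higher log-probability first)
ranked <- results[order(results$probability, decreasing = TRUE), ]

# The most likely continuation is the first row after sorting
best <- ranked$word[1]
print(best)
#> [1] "web"
```

With a real call, the same ranking would apply to `conditionalProbabilities$results`.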

Author(s)

Phil Ferriere pferriere@hotmail.com

Examples

## Not run: 
 tryCatch({

   # Calculate conditional probability a particular word will follow a given sequence of words
   conditionalProbabilities <- weblmCalculateConditionalProbability(
     precedingWords = "hello world wide",       # ASCII only
     continuations = c("web", "range", "open"), # ASCII only
     modelToUse = "title",                      # "title"|"anchor"|"query"|"body"(default)
     orderOfNgram = 4L                          # 1L|2L|3L|4L|5L(default)
   )

   # Class and structure of conditionalProbabilities
   class(conditionalProbabilities)
   #> [1] "weblm"

   str(conditionalProbabilities, max.level = 1)
   #> List of 3
   #>  $ results:'data.frame':  3 obs. of  3 variables:
   #>  $ json   : chr "{"results":[{"words":"hello world wide","word":"web", __truncated__ }]}
   #>  $ request:List of 7
   #>   ..- attr(*, "class")= chr "request"
   #>  - attr(*, "class")= chr "weblm"

   # Print results
   pander::pandoc.table(conditionalProbabilities$results)  # requires the pander package
   #> -------------------------------------
   #>      words        word   probability
   #> ---------------- ------ -------------
   #> hello world wide   web      -0.32
   #>
   #> hello world wide range     -2.403
   #>
   #> hello world wide  open      -2.97
   #> -------------------------------------

 }, error = function(err) {

   # Print the message carried by the caught error condition
   message(conditionMessage(err))

 })

## End(Not run)

mscsweblm4r documentation built on May 2, 2019, 3:46 p.m.