Learnr External Evaluator Tests

library(learnr)
if (!rlang::is_installed("gradethis")) {
  stop("Please install `gradethis` from GitHub (rstudio/gradethis) for this example.")
}

knitr::opts_chunk$set(echo = FALSE)
knitr::opts_chunk$set(exercise.checker = function(...){
  library(gradethis)
  gradethis::grade_learnr(...)
})
# Set up a remote evaluator.
options(tutorial.external.host = "http://localhost:8080")
x <- "123"
library(gradethis)

Basics

rnorm(1)

Should produce one random number

plot(1,1)

Should produce an image with one point

Code that generates an error

# Reference a variable that's not defined
a123

You should see a red box that says object 'a123' not found

Invalid syntax

asdf + ___1

You should see a red box with a parse error along the lines of: unexpected input ... asdf + _ ^ (the caret marks the offending underscore).

Setup Chunks

Global Setup

x

Should be 123. Defined in the setup-global-exercise chunk.

Local Setup

Test an exercise that has a matching -setup chunk.

y
y <- "abc"

y should be "abc"

Local Setup #2

Test an exercise that names its setup chunk in the knitr parameters.

y
y <- 456

y should be 456.

Local Setup #3

Test chained setup chunks for an exercise.

a <- 1
b <- a + 5
b

c should be 9 (b is 6 in setup code)

c <- b + 3
c

d should be 10

d <- c + 1
d

Timeouts

Default timeouts

By default, the timeout for remote evaluator requests is 30s. This request should time out after 30-35s.

Sys.sleep(35)
print("I didn't time out!")

You should see an error message stating that the exercise timed out.

Option timeout

This exercise uses the global option options(tutorial.exercise.timelimit = 5).

# TODO

Exercise-Specific Timeout

Sys.sleep(7)
print("I didn't time out")

After 5-10 seconds, you should see an error message about the timeout being exceeded.
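For intuition about what a time limit on an exercise does, here is a minimal base-R sketch. It is not how learnr or the remote evaluator enforces timeouts; `run_with_limit` is a hypothetical helper built on `setTimeLimit()`, whose limits R checks during `Sys.sleep()`.

```r
# Hypothetical helper (not part of learnr): evaluate `expr` under an
# elapsed-time limit and return the error message if the limit is hit.
run_with_limit <- function(expr, seconds) {
  # Always clear the limit again, whether or not `expr` finished.
  on.exit(setTimeLimit(elapsed = Inf))
  tryCatch({
    setTimeLimit(elapsed = seconds, transient = TRUE)
    expr  # lazy evaluation: the expression is only forced here
  }, error = function(e) conditionMessage(e))
}

# Scaled-down version of the exercise above: sleep 2s under a 0.5s limit.
result <- run_with_limit({
  Sys.sleep(2)
  "I didn't time out!"
}, seconds = 0.5)
print(result)
```

If the limit is reached, `result` holds R's time-limit error message instead of the string the block would otherwise return.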

Code Checkers

Exercise-Specific Check

Enter the number 4, then click "Submit Answer" (not Run Code)

3
#4
gradethis::grade_result(
  gradethis::fail_if(~ identical(.result, 1), "Custom message for value 1."),
  gradethis::pass_if(~ identical(.result, 2), "Custom message for value 2."),
  gradethis::fail_if(~ identical(.result, 3), "Custom message for value 3."),
  gradethis::pass_if(~ identical(.result %% 2, 0) && (.result < 5),
          "Even number below 5")
)

If you enter 3, you should see an error: "Custom message for value 3." If you enter 4, you should see a success message with praise.

Checks that leverage setup blocks

Checks should be able to use the setup block. For example, the setup can call library(gradethis), and the checker can then use those functions without namespacing them.

The following should work just as the previous example did. Click "Submit Answer" (not Run Code)

3
#4
grade_result(
  fail_if(~ identical(.result, 1), "Custom message for value 1."),
  pass_if(~ identical(.result, 2), "Custom message for value 2."),
  fail_if(~ identical(.result, 3), "Custom message for value 3."),
  pass_if(~ identical(.result %% 2, 0) && (.result < 5),
         "Even number below 5")
)

If you enter 3, you should see an error: "Custom message for value 3." If you enter 4, you should see a success message with praise.

Check that accepts an exception

Throw an error with the message "boom".

stop("boom")
stop("boom")
grade_code("Nice error!")

If you run stop("boom"), you should see a red box with boom.

If you submit stop("boom"), you should see a green box that has Nice error!.

For an incorrect submission, you should see a red box with a random encouragement message.

Isolation & Malicious Code

We can test the sandboxing and isolation of the evaluator.

Network Access

download.file("http://1.1.1.1", "file.html")
readLines("file.html", n=1)

If the network is enabled, you'll see a line of HTML. If the network is disabled you'll see an error like cannot open URL 'http://1.1.1.1'.
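The same probe can be wrapped so that it reports an outcome instead of stopping on the error. This is only a sketch: `check_network` and its `timeout_s` argument are hypothetical names, and the short timeout is an assumption added so that a blocked request does not hang for long.

```r
# Hypothetical helper (not part of learnr): report whether an HTTP download works.
check_network <- function(url = "http://1.1.1.1", timeout_s = 5) {
  old <- options(timeout = timeout_s)  # restore the previous timeout on exit
  dest <- tempfile(fileext = ".html")
  on.exit({
    options(old)
    unlink(dest)
  })
  tryCatch({
    suppressWarnings(download.file(url, dest, quiet = TRUE))
    paste("network enabled:", readLines(dest, n = 1, warn = FALSE))
  }, error = function(e) paste("network disabled:", conditionMessage(e)))
}

status <- check_network()
print(status)
```

Either way the function returns a single string, which makes the result easy to compare across evaluator configurations.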

Too-Long Init

TODO: test that global_setups that take longer than 30s are cut off.

Recycling Environments

We should not be reusing environments. So running this chunk:

newvar <- 123

and then this one:

exists("newvar")

should return FALSE.
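The expectation can be illustrated locally with a minimal sketch in which each submission is evaluated in its own throwaway environment. This is a simplification for illustration only: `run_isolated` is a hypothetical helper, and the real evaluator isolates exercises by process rather than by environment.

```r
# Hypothetical helper (not part of learnr): evaluate one code string in a
# fresh environment, so assignments do not leak into later submissions.
run_isolated <- function(code) {
  env <- new.env(parent = globalenv())
  eval(parse(text = code), envir = env)
}

run_isolated("newvar <- 123")              # assigns inside the first environment
leaked <- run_isolated('exists("newvar")') # runs in a second, separate environment
print(leaked)
```

Provided no `newvar` already exists in the global environment, the second call cannot see the variable created by the first.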

Interrupting the process

We use forked processes to evaluate exercises, so running stop() may interrupt our ability to return a meaningful response, but it shouldn't cause subsequent problems. This one's tricky to test, so we can just confirm that if we run stop:

stop()

(may return an empty red box or an error message)

Subsequent exercises should still work:

rnorm(1)

should return a random number.

Memory/CPU/Process/File System limitations

TODO


learnr documentation built on Sept. 28, 2023, 9:06 a.m.