knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library("exams2moodle")
prob_
prob_
rv_
test_
ci_
If you are generate exercise for a student you usually encounter two problems:
The second problem just requires the extension of the existing routines.
You can encounter problem 1 only if the students are not allowed to use computers to solve the exercises. Personally I believe that students only understand a statistical coefficient or graphic if they are able to calculate the required value(s) by hand; it is like learning the times tables by heart with pupils. Otherwise most of them will be just random button pushers in the future.
To solve the problems I decided to generate appropriate (full) data sets such that I have control over it. The general approach is
library("exams") library("exams2moodle") repeat { ... # some data generation if (condition_holds) break }
For example for the compution of the median from five observations $x_i$ we know that the third sorted observation $x_{(3)}$) is the solution. Of course, we have to avoid that by chance the third sorted observation is equal to the third observation, otherwise a student may miss an important step in computing the median.
library("exams") library("exams2moodle") repeat { x <- sample(1:10, size=5) sx <- sort(x) if (x[3]!=sx[3]) break } x
to_choiceFor determining the correct level of measurement of a variable we use an Excel file with two columns with the name of the variable and the level of measurement.
# subset of variables we use, variable names are in German data("skalenniveau") skalen <- c("nominal", "ordinal", "metrisch") stopifnot(all(skalenniveau$type %in% skalen)) # protect against typos skala <- sample(skalenniveau$type, 1) exvars <- sample(nrow(skalenniveau), 8) tf <- (skalenniveau$type[exvars]==skala) sc <- to_choice(skalenniveau$name[exvars], tf) # Additional answer: Does none fit? sc$questions <- c(sc$questions, "Keine der Variablen hat das gewünschte Skalenniveau") sc$solutions <- c(sc$solutions, !any(tf)) sc
The to_choice function generates a object such that can be used in answerlist and mchoice2string . The first parameter is either a vector or data frame. The second parameter is a logical vector containing TRUE if the element in the vector (or row in the data frame) contains a true answer.
The parameter shuffle samples from the correct and false answers. The following example could replace the main code from the example above.
# subset of variables we use, variable names are in German data("skalenniveau") skalen <- c("nominal", "ordinal", "metrisch") skala <- sample(skalenniveau$type, 1) exvars <- sample(nrow(skalenniveau), 8) tf <- (skalenniveau$type[exvars]==skala) # select one true and four false answers sc <- to_choice(skalenniveau$name[exvars], tf, shuffle=c(1,4)) sc
By default the answers are ordered, the parameter order which function is used to order the answers (default: order). To use the ordering given set order=NULL.
means_choice, to_choicemeans_choice computes a list of mean values for a given data vector x:
If the parameter trim and/or winsor set to NA then these means will not be computed.
digits <- 2 # round to two digits repeat { x <- round(runif(7, min=165, max=195), digits) ms <- means_choice(x, digits) if (attr(ms, "mindiff")>0.1) break # make sure that all values are different by 0.1 } ms <- unlist(ms) sc <- to_choice(ms, names(ms)=='mean') # arithmetic mean is the correct solution str(sc)
The attribute mindiff gives the minimal distance between two mean values. This might be important for setting extol the tolerance for numeric solutions.
num_resultFor computing the arithmetic mean we just need to ensure that data are rounded to a certain level since the computation with pocket calculator is simple.
library("exams2moodle") n <- sample(4:8, size=1) x <- round(runif(n, min=1.65, max=1.85), 2) sc <- num_result (c(mean(x), x), digits = 2, tolmult = 2)
The function num_result creates a list with the elements
x the original values,fx the rounded values with fmt function from the exams package as character,tolerance the tolerance for extol, anddigits the digits used for rounding.The tolerance is computed as tolmult*10^(-digits).
fractions, is_terminalTo overcome the rounding problem there is a simple approach: try to use (terminal) fractions. A terminal fraction generates a number with a finite number of digits, for example $\frac{1}{10}=0.1$.
The command fractions calls simply MASS::fractions() to avoid that you have explictly to load the library MASS. The result of calling fractions has an attribute fracswhich contains a (approximate) fraction as $\frac{numerator}{denominator}$ representation.
x <- c(1/5, 1/6) x fractions(x) str(fractions(x))
Therefore is_terminal tests if all entries are terminal fractions which means the denominators must be dividable by two and five only.
x <- c(1/5, 1/6) is_terminal(x)
Unfortunately we use a decimal numeral system limiting the number possible of denominators which lead to terminal numbers; the ancient babylonian cultures using a sexagesimal numeral system had a larger number of denominators which would lead to terminal numbers.
num_result, int_resulthtml_matrix_skSince creating tables for Moodle is still a pain, we used two approaches
as_string, as_obs), ordecided to replace a table with the $x$ values with a enumeration in LaTeX
as_stringx <- round(runif(5), 2) as_string(x) as_string(x, collapse="+", last="+")
as_obsx <- round(runif(5), 2) as_obs(x) as_obs(x, "y") as_obs(sort(x), "y", sorted=TRUE)
lsumprodx <- round(2*runif(5)-1, 2) y <- round(2*runif(5)-1, 2) lsumprod(x,y)
To make it more difficult for students to recognize similar tasks we change the story for a task. Fortunately the internet provides us with a lot of collections with statistics exercises and the stories ;)
The file are stored in ./subdir/exercisename_num.Rmd, ./subdir/exercisename_1.Rmd, ./subdir/exercisename_2.Rmd, and so on. The framework in ./subdir/exercisename_num.Rmd is shown below. Note that all story files have to create the same variables for the solution part.
```{r, echo = FALSE, results = "hide"}
library("exams")
library("exams2moodle")
ed <- exams:::.exams_get_internal("xexams_dir_exercises")
stories <- list.files(pattern="exercisename\\_[0-9]*.Rmd", path=paste0(ed, "/some_subdir"), full.names = TRUE)
story <- sample(stories, size=1)
```
Question
========
```{r, child=story}
```
Some additional common question text, e.g. for rounding the result(s).
Solution
=========
Some solution text which is basically the same for all exercisename_X.Rmd's.
Meta-information
================
extype: ...
exsolution: ...
extol: ...
exname: `r if(exists("story")) story else knitr::current_input()`
Note: if you copy the source before then you will find a zero-width space after the first backticks which will be visible in RStudio and needs to be deleted.
nowIf we randomize the task and the stories then we may have a lot of different tasks. If questions arise then we need to identify the exact task a student has.
Therefore we embed a
substring(now(), 10)
The now function uses gsub('.', '', sprintf("%.20f", as.numeric(Sys.time())), fixed=TRUE) and ensures that every time called a different number is returned.
add_dataadd_data adds data point(s) to the left and/or the right of a given data vector x.
box="range" gives a box width of width=max(x)-min(x) and two points xleft=min(x) and xright=max(x)box="box" gives a box width of width=IQR(x) and two points xleft=quantile(x, 0.25) and xright=quantile(x, 0.75)box=c(xleft, xright) gives a box width of width=xright-xleft and two points xleft and xrightnn=c(nleft, nright) gives the number of points to generate at the left and rightn=1 is a short form of c(0,1) (the default)xleft-range[2]*width; xleft-range[1]*width] are nleft points uniformly drawn and within the interval [xright+range[1]*width; xright+rang[2]*width] are nleft points uniformly drawn (both intervals are colored in red)par(mar=c(0,0,0,0)) plot(c(0, 1), c(0.15,1.15), axes=FALSE, type="n", xlab="", ylab="") rect(0.25, 0.25, 0.75, 0.75) text(0.25, 0.25, labels="xleft", pos=1) text(0.75, 0.25, labels="xright", pos=1) text(0.5, 0.75, labels="width", pos=3) arrows(0.25, 0.8, 0.75, 0.8, code=3, length=0.1) arrows(0.0, 0.5, 0.2, 0.5, code=3, col="red", length=0.1) arrows(0.8, 0.5, 1.0, 0.5, code=3, col="red", length=0.1) text(0.8, 0.8, "xright+range[1]*width", col="red", srt=90) text(1, 0.8, "xright+range[2]*width", col="red", srt=90) text(0, 0.8, "xleft-range[2]*width", col="red", srt=90) text(0.2, 0.8, "xleft-range[1]*width", col="red", srt=90)
x <- runif(7, 165, 195) xr <- add_data(x, "range", n=c(0,1), range=c(1,1.5)) round(xr) xb <- add_data(x, "box", n=c(0,1), range=c(1,1.5)) round(xb) x1 <- add_data(x, box=c(165,195), n=c(0,1), range=c(1,1.5)) round(x1)
scale_toGiven a numeric vector it uses a linear transformation to rescale the data to a given mean and standard deviation. The defult is to standardize the data
x <- runif(30) mean(x) sd(x) # y <- scale_to(x, mean=2, sd=0.5) mean(y) sd(y)
ddiscreteddiscrete generates a finite one-dimensional discrete probability distribution. If the length of x is one
then x is the number of elements. Otherwise x is considered as a starting distribution and length of x is the number of elements.
The parameter zero determines if the final distribution can contain the probability entry zero or not. Since for computation of exercises
based on a one-dimensional discrete probability distribution it favorably that the entries are fractions have the same denominator the parameter
unit can be used for this. Thus, if the smallest non-zero denominator should be 1/7 then use unit=7; the default is a power of 10.
ddiscrete(6) # fair dice x <- runif(6) ddiscrete(x) ddiscrete(x, zero=TRUE) ddiscrete(x, unit=15) fractions(ddiscrete(x, unit=15))
ddiscrete2ddiscrete2 generates a finite two-dimensional discrete probability distribution. The generation needs two steps:
The current available association measure are:
nom.cc (corrected) contingency coefficentnom.cramer Cramer's V or Phiord.spearman Spearman's rank correlationord.kendall Kendall's rank correlationr <- ddiscrete(6) c <- ddiscrete(6) ddiscrete2(r, c) ddiscrete2(r, c, FUN=nom.cc, target=0.4) ddiscrete2(r, c, FUN=nom.cc, target=1)
The units are determined as units of r multiplied with the units of c. Since a iterative process is used the parameter maxitis set to 500.
If the attribute iterations is equal to maxit then the iterative process has not been finished. The attribute target gives the asscociation value obtained.
distributionAn object of class distribution holds a distribution (of a random variable). It is specified by a name and the distribution values. The name is used create quantile (paste0("q", name)) and cumulative distribution functions (paste0("p", name)), for example
binom hypergeometric distribution with parameters: size, probhyper hypergeometric distribution with parameters: m, n, kgeom geometric distribution with parameter: probpois Poisson distribution with parameter: lambdaunif hypergeometric distribution with parameters: min, maxexp: exponential distribution with parameter: ratenorm: normal distribution with parameters: mean, sdlnorm: log-normal distribution with parameters: meanlog, sdlogt: Student t distribution with parameter: dfchisq: chi-squared distribution with parameter: dff: F distribution with parameters: df1, df2The names of the above distributions can be abbreviated, for all other the exact name must be given.
d <- distribution("t", df=15) quantile(d, c(0.025, 0.975)) d <- distribution("norm", mean=0, sd=1) cdf(d, c(-1.96, +1.96)) d <- distribution("binom", size=9, prob=0.5) pdf(d, 5)
The routines codifies rules for the approximation of distributions. If you generate dynamically exercises then you might want to avoid that an approximation is possible. However, only a part of the requirements can be checked, e.g. for the central limit theorem if more than $c$ variables are added.
t2normA $t$ distribution can be approximated by a standard normal distribution when the degree of freedom is large enough. The default is $n>30$. The default value of $30$ can be overwritten with options(distribution.t2norm=50) or explicitly set.
t2norm(40) t2norm(40, c=50)
binom2normA binomial distribution can be approximated by a normal distribution if:
The default value of $c=9$ can be overwritten with options(distribution.binom2norm=5) or explicitly set.
binom2norm(30, 0.5) binom2norm(50, 0.5) binom2norm(10, 0.5, type="double") binom2norm(30, 0.5, type="double")
clt2normThe central limit theorem allows the approximation of a sum of random variables with a normal distribution. The approximation holds if enough variables are added. The default value of $c=30$ can be overwritten with options(distribution.clt2norm=5) or explicitly set.
clt2norm(50) clt2norm(50, c=100)
means_choicemeans_choice computes a list of mean values for a given data vector x:
If the parameter trim and/or winsor set to NA then these means will not be computed.
x <- runif(7, min=165, max=195) str(means_choice(x, 2))
The attribute mindiff gives the minimal distance between two mean values.
mcvalThe function computes the mode (most common value) of data. Note that in case if the mode is not unique than all values will be returned.
# numeric x <- sample(1:5, size=25, replace = TRUE) table(x) mcval(x) # character x <- sample(letters[1:5], size=25, replace = TRUE) table(x) mcval(x) # histogram x <- hist(runif(100), plot=FALSE) mcval(x) mcval(x, exact=TRUE)
histdatahistdata computes data about the corresponding histogram to a vector like hist, but returns more information which might be necessary for exercises. In contrast to hist histdata requires that breaks covers the entire range of x.
histdata has the additional parameterprobs. If breaks="quantiles" then it determines which quantiles are used.
x <- runif(25) h1 <- hist(x, plot=FALSE) str(h1) h2 <- histdata(x) str(h2)
The returned list contains the following elements:
x: the finite data values usedclass: the class number in which a value falls starting with 1 for the first class xname: the x argument namebreaks: the class borderslower: the lower class bordersupper: the upper class borderswidth: the class widthsmid: the class midsequidist: if the classes are equidistant or notcounts: the number of observations in each classrelfreq: the relative class frequencydensity: the frequency density computed as relative frequency divided by class widthYou can compute mean, quantile, median and mode for a histogram:
x <- runif(25) h <- histdata(x) # mean mean(h) # median & quantile median(h) quantile(h) # mode mcval(h) mcval(h, exact=TRUE)
histxCreates a data set for a given set of breaks (class borders) and number of class observations. The data are generated such that the data points are uniformly distributed in each class.
breaks <- seq(1.6, 2.1, by=0.1) x <- histx (breaks, sample(5:15, length(breaks)-1)) hist(x, breaks) rug(x)
histwidthCreates histogram data sampled from a set of class widths with following properties:
hw <- histwidth(1.6, 2.1, widths=0.05*(1:4)) str(hw) x <- histx (hw$breaks, hw$n) hist(x, hw$breaks) rug(x)
combinatorics, permutation, variation, and combinationComputes all results for permutation, variation and combination with and without reptition.
variation(7,3) # without replication variation(7,3, TRUE) # with replication combination(7,3) # without replication combination(7,3, TRUE) # with replication permutation(7) permutation(7, c(2,1,4)) # three groups with indistinguishable elements combinatorics(7, 3)
The attribute mindiff gives the minimal distance between two values.
A set function which compute which compute association measure based on a contingency table:
nom.cc (corrected) contingency coefficientnom.cramer Cramer's Vord.spearman Spearman's rank correlationord.kendall Kendalls rank correlationtab <- matrix(round(10*runif(15)), ncol=5) nom.cc(tab) nom.cc(tab, correct=TRUE) nom.cramer(tab) ord.spearman(tab) ord.kendall(tab)
affix, mathAdds a prefix and/or suffix to a (character) vector.
x <- runif(5) affix(x, "$", "$") math(x)
as_string, as_obsConverts a (character) vector in a single string. If the vector is not of class character then it will be converted to character.
x <- round(runif(5),2) as_string(x) as_string(x, last=" und ") # as german as_string(x, collapse=" + ", last= " + ") # as sum # observations as_obs(x) as_obs(sort(x), name="y", sorted=TRUE)
num_result, num2strnum_result creates a list with a solution x, a (rounded) text representation fx, and a tolerance for the result.
str(num_result(pi, 3)) str(num_result(pi, 6)) str(num_result(pi, 6, tolmult=5)) str(num_result(pi, 6, tolmult=5, tolerance=1e-6))
num2str computes a list of string representation of a numeric vector using as.character.
x <- 1 str(num2str(x)) y <- 2 str(num2str(x, y)) str(num2str(x, y, z=c(x,y)))
as_tableConverts a vector into a horizontal table. The parameters are the same as in xtable which is used internally. Intended for the use as (class) frequency table.
x <- runif(5) tab <- vec2mat(x, colnames=1:length(x)) as_table(tab) tab <- vec2mat(x, colnames=sprintf("%.0f-%0.f", 0:4, 1:5)) as_table(tab)
html_matrix, zebra, tooltip, toHTMLReturns a HTML representation of a matrix as table. If you create exercises for Moodle then you can embed the HTML code in your exercise since it will be translated by exams2moodle to HTML, too.
# Copy and paste program to console library("tools") library("magrittr") library("exams2moodle") x <- matrix(1:12, ncol=3) hm <- html_matrix(x) toHTML(hm) hm <- html_matrix(x) %>% zebra() %>% tooltip(sprintf("Table has %.0f rows and %.0f columns", nrow(.), ncol(.))) toHTML(hm)
With parameters the appearance of the table can be influenced:
title entry at the top left (default: "")caption entry for the caption (default: "")names$col entry for the column names (default: colnames(x))names$row entry for the row names (default: rownames(x))style$table style for the table (default: "")style$caption style for the caption (default: "")style$title style for the caption (default: "background-color:#999999;vertical-align:top;text-align:left;font-weight:bold;")style$row style for the row names (default: "background-color:#999999;vertical-align:top;text-align:right;font-weight:bold;")style$col style for the col names (default: "background-color:#999999;vertical-align:top;text-align:right;font-weight:bold;")style$cell style for the col names (default: c("background-color:#CCCCCC; vertical-align:top; text-align:right;", "background-color:#FFFFFF; vertical-align:top; text-align:right;"))style$logical style for a logical matrix entry (default: c("background-color:#CCCCCC; vertical-align:top; text-align:right;", "background-color:#FFFFFF; vertical-align:top; text-align:right;"))style$numeric style for a numeric matrix entry (default: c("background-color:#CCCCCC; vertical-align:top; text-align:right;", "background-color:#FFFFFF; vertical-align:top; text-align:right;"))style$char style for a character matrix entry (default: c("background-color:#CCCCCC; vertical-align:top; text-align:right;", "background-color:#FFFFFF; vertical-align:top; text-align:left;")format$title$fmt parameter to format the title via sprintf (default: "\%s")format$row$fmt parameter to format the row names via sprintf (default: "\%s")format$col$fmt parameter to format the col names via sprintf (default: "\%s")format$cell$fmt parameter to format a matrix entry via sprintfformat$logical$fmt parameter to format a logical matrix entry via sprintf (default: "\%d")format$numeric$fmt parameter to format a numeric matrix entry via sprintf (default: "\%f")all_differentFor solutions in multiple choice exercises you want to ensure that the numerical results are not too near to each other.
Therefore, all_different checks if the differences between the entries in obj are larger than some given value tol.
x <- runif(20) all_different(x, 1) # minimal distance is at least 1 all_different(x, 1e-4) # minimal distance is at least 0.0001
equalCompares two numeric values if they are equal given a tolerance (default: 1e-6) by abs(x-y)<tol).
x <- pi y <- pi+1e-4 equal(x, y) equal(x, y, tol=1e-3)
firstmatchfirstmatch seeks matches for the elements of its first argument among those of its second. If multiple matches are found then
the first match is returned, for further details see charmatch.
firstmatch("d", c("chisq", "cauchy")) firstmatch("c", c("chisq", "cauchy")) firstmatch("ca", c("chisq", "cauchy"))
fractionsfractions is a copy of MASS::fractions to compute from a numeric values fractions.
d <- ddiscrete(6) fractions(d)
hyperloop, unique.elemFor generating answers for multiple choice exercises it is helpful to run the same routine several times with different input parameters. For example students may forget to divide by n-1 or divide by n instead of n. hyperloop runs about all parameter combinations.
ttest_num is a routine which computes all information required for exercises with a $t$-test.
library("exams2moodle") x <- runif(100) correct <- ttest_num(x=x, mu0=0.5, sigma=sqrt(1/12)) str(correct)
Now, let us run many $t$-tests (up to 384) with typical student errors. We extract all different test statistic and choose seven wrong answers and one correct answer with the condition that all solutions differ at least by 0.051.
res <- hyperloop(ttest_num, n = list(1, correct$n, correct$n+1), mu0 = list(correct$mu0, correct$mean), mean = list(correct$mu0, correct$mean), sigma = list(correct$sigma, correct$sd, sqrt(correct$sigma), sqrt(correct$sd)), sd = list(correct$sigma, correct$sd, sqrt(correct$sigma), sqrt(correct$sd)), norm = list(TRUE, FALSE) ) # extract all unique test statistics stat <- unlist(unique_elem(res, "statistic")) # select 7 wrong test statistic such that the difference # between all possible test statistics is at least 0.01 repeat { sc <- to_choice(stat, stat==correct$statistic, shuffle=c(1,7)) if (all_different(sc$questions, 0.005)) break } # show possible results for a MC questions sc$questions sc$solutions
alt attribute is shown!library("exams2moodle") # replace the next line with # file <- "location_of_your_downloaded_JSON_file" file <- system.file("json", "Test-Klausur-Statistik.json", package="exams2moodle") # this will create in the directory with your JSON file a PDF file json2report(file)
Named parameters which can be used:
template: a template RMarkdown file. If none given then the default file system.file("templates", "hu-german.Rmd", package="exams2moodle") will be used.author: an author name, default "".title: a report title. If none given then the base name of the JSON file is used.pdf_param: a list of parameters given to rmarkdown::pdf_document().In case that you develop your own templates you may have further parameters which you might want to replace in the template. For example if you use a named parameter subtitle then all {{subtitle}} will be replaced in the template file by the content of the parameter subtitle.
hu_german.RmdAccess the template with system.file("templates", "hu_german.Rmd", package="exams2moodle").
Note that the real work is done within the template, not in the json2report function.
cat(paste0(readLines(system.file('template', 'hu_german.Rmd', package='exams2moodle')), collapse="\n"))
question_german.RmdAccess the template with system.file("templates", "question_german.Rmd", package="exams2moodle").
cat(paste0(readLines(system.file('template', 'question_german.Rmd', package='exams2moodle')), collapse="\n"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.