`riskyr` User Guide

"Solving a problem simply means representing it so as to make the solution transparent."
(H.A. Simon)^[Simon, H.A. (1996). The Sciences of the Artificial (3rd ed.). The MIT Press, Cambridge, MA. (p. 132).]

What is the probability of a disease or clinical condition given a positive test result? This seems like a simple and fairly common question, yet doctors, patients, and medical students find it surprisingly difficult to answer.

Decades of research on probabilistic reasoning and risk literacy have shown that people are perplexed and struggle when information is expressed in terms of probabilities, but have little difficulty understanding and processing the same information when it is expressed in terms of natural frequencies (see Gigerenzer and Hoffrage, 1995; Gigerenzer et al., 2007; Hoffrage et al., 2015; for overviews).

riskyr

riskyr is a toolbox for rendering risk literacy more transparent by facilitating such changes in representation and offering multiple perspectives on the dynamic interplay between probabilities and frequencies. The main goal of riskyr is to provide a long-term boost in risk literacy by fostering competence in understanding statistical information in domains such as health, weather, and finances (Hertwig & Grüne-Yanoff, 2017).

This guide first illustrates a typical problem and then helps you solve it by viewing risk-related information in a variety of ways. It proceeds in three steps:

  1. We will first present a typical problem in the probabilistic format that is commonly used in textbooks. This allows us to introduce some key probabilities, but also explains why both the problem and its traditional solution (via Bayes' formula) remain opaque and are rightfully perceived as difficult.

  2. We will then translate the problem into natural frequencies and show how this facilitates its comprehension and solution.

  3. Finally, we show how riskyr renders the problem more transparent by providing three sets of tools:

    A. A fancy calculator that allows the computation of probabilities and frequencies;

    B. A set of functions that translate between different representational formats;

    C. A variety of visualizations that illustrate relationships between frequencies and probabilities.

Motivation: A Problem of Probabilities

A basic motivation for developing riskyr was to facilitate our understanding of problems like the following:

Mammography screening

The probability of breast cancer is 1% for a woman at age 40 who participates in routine screening.
If a woman has breast cancer, the probability is 80% that she will get a positive mammography.
If a woman does not have breast cancer, the probability is 9.6% that she will also get a positive mammography.

A woman in this age group had a positive mammography in a routine screening.
What is the probability that she actually has breast cancer?

(Hoffrage et al., 2015, p. 3)

Information provided and asked

Problems like this tend to appear in texts and tutorials on risk literacy and are ubiquitous in medical diagnostics. They typically provide some risk-related information (i.e., the probability of some clinical condition and the likelihoods with which some decision or test detects its presence or absence) and ask for some other risk-related quantity. In the most basic type of scenario, we are given 3 essential probabilities:

  1. The prevalence of some target population (here: women at age 40) for some condition (breast cancer):

     prev = $p(\mathrm{cancer}) = 1\%$

  2. The sensitivity of some decision or diagnostic procedure (here: a mammography screening test), which is the conditional probability:

     sens = $p(\mathrm{positive\ test}\ |\ \mathrm{cancer}) = 80\%$

  3. The false alarm rate of this decision, diagnostic procedure, or test, which is the conditional probability:

     fart = $p(\mathrm{positive\ test}\ |\ \mathrm{no\ breast\ cancer}) = 9.6\%$

     and can also be expressed by its complement (aka the test's specificity):

     spec = $p(\mathrm{negative\ test}\ |\ \mathrm{no\ breast\ cancer}) = 1 - \mathrm{fart} = 90.4\%$

The first challenge in solving this problem is to realize that the probability asked for is not the sensitivity sens (i.e., the probability of a positive test given cancer), but the reversed conditional probability (i.e., the probability of having cancer given a positive test). The clinical term for this quantity is the positive predictive value (PPV) or the test's precision:

PPV = $p(\mathrm{cancer}\ |\ \mathrm{positive\ test})$

How can we compute the positive predictive value (PPV) from the information provided by the problem? In the following, we sketch three different paths to the solution.

Using Bayes' formula

One way to solve problems concerning conditional probabilities is to remember and apply Bayes' formula (which is why such problems are often called problems of "Bayesian reasoning"):

$$ p(H|D) = \frac{p(H) \cdot p(D|H) } {p(H) \cdot p(D|H) + p(\neg H) \cdot p(D|\neg H) } $$

In our example, we are looking for the probability of breast cancer ($H$) given a positive mammography test ($D$):

$$ p(\mathrm{cancer}\ |\ \mathrm{positive\ test}) = \frac{p(\mathrm{cancer}) \cdot p(\mathrm{positive\ test}\ |\ \mathrm{cancer}) } {p(\mathrm{cancer}) \cdot p(\mathrm{positive\ test}\ |\ \mathrm{cancer}) + p(\mathrm{no\ cancer}) \cdot p(\mathrm{positive\ test}\ |\ \mathrm{no\ cancer}) } $$

By inserting the probabilities identified above and knowing that the probability for the absence of breast cancer in our target population is the complementary probability of its presence (i.e., $p(\mathrm{no\ cancer}) = 1 - \mathrm{prev} = 99\%$) we obtain:

$$ p(\mathrm{cancer}\ |\ \mathrm{positive\ test}) = \frac{1\% \cdot 80\% } { 1\% \cdot 80\% + 99\% \cdot 9.6\% } \approx\ 7.8\%$$

Thus, the information above and a few basic mathematical calculations tell us that the likelihood of a woman in our target population with a positive mammography screening test actually having breast cancer (i.e., the PPV of this mammography screening test) is slightly below 8\%.
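For readers who want to verify the arithmetic, the same computation can be reproduced in a few lines of base R (this is a plain re-implementation of Bayes' formula with the problem's three probabilities; no riskyr functions are involved yet):

```r
# Bayes' formula with the 3 probabilities stated in the mammography problem:
prev <- .01    # p(cancer)
sens <- .80    # p(positive test | cancer)
fart <- .096   # p(positive test | no cancer)

PPV <- (prev * sens) / (prev * sens + (1 - prev) * fart)
PPV  # => approx. 0.0776 (i.e., slightly below 8%)
```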

Using Natural Frequencies

If you fail to find the Bayesian solution easy and straightforward, you are in good company: Even people who have studied and taught statistics find it difficult to think in these terms. Fortunately, researchers have found that a simple change in representation renders the same information much more transparent.

Consider the following problem description:

Mammography screening (freq)

10 out of every 1000 women at age 40 who participate in routine screening have breast cancer.
8 out of every 10 women with breast cancer will get a positive mammography.
95 out of every 990 women without breast cancer will also get a positive mammography.

Here is a new representative sample of women at age 40 who got a positive mammography in a routine screening.
How many of these women do you expect to actually have breast cancer?

(Hoffrage et al., 2015, p. 4)

Importantly, this version (freq) of the problem refers to a frequency of $1000$ individuals of our original target population. It still provides the same probabilities as above, but specifies them in terms of natural frequencies (see Gigerenzer & Hoffrage, 1999, and Hoffrage et al., 2002, for clarifications of this concept):

  1. The prevalence of breast cancer in the target population:

     prev = $p(\mathrm{cancer}) = \frac{10}{1000}\ (= 1\%)$

  2. The sensitivity of the mammography screening test, which is the conditional probability:

     sens = $p(\mathrm{positive\ test}\ |\ \mathrm{cancer}) = \frac{8}{10}\ (= 80\%)$

  3. The test's false alarm rate, which is the conditional probability:

     fart = $p(\mathrm{positive\ test}\ |\ \mathrm{no\ breast\ cancer}) = \frac{95}{990}\ (\approx 9.6\%)$

     and can still be expressed by its complement (the test's specificity):

     spec = $p(\mathrm{negative\ test}\ |\ \mathrm{no\ breast\ cancer}) = \frac{895}{990}\ (\approx 90.4\%)$

Rather than asking us to compute a conditional probability (i.e., the PPV), the task now prompts us to imagine a new representative sample of women from our target population and focuses on the women with a positive test result. It then asks for a frequency: "How many of these women" do we expect to have cancer?

To provide any answer in terms of frequencies, we need to imagine a specific sample size $N$. As the problem referred to a population of $1000$ women, we conveniently pick a sample size of $N = 1000$ women with identical characteristics (which is suggested by mentioning a "representative" sample) and ask: How many women with a positive test result actually have cancer?^[The actual sample size N chosen is irrelevant, but the numbers are easier to calculate when N is a round number and at least as large as the frequencies mentioned in the problem.]

In this new sample, the frequencies of women with cancer and with a positive test result should match the numbers of the original sample. Hence, we can assume that $10$ out of $1000$ women have cancer (prev) and $8$ of the $10$ women with cancer receive a positive test result (sens). Importantly, $95$ out of the $990$ women without cancer also receive a positive test result (fart). Thus, the number of women with a positive test result is $8 + 95 = 103$, but only $8$ of them actually have cancer. Of course, the ratio $\frac{8}{103}$ is virtually identical to our previous probability (of roughly 7.8\%).

Incidentally, the reformulation in terms of frequencies protected us from erroneously taking the sensitivity (of sens = $\frac{8}{10} = 80\%$) as an estimate of the desired frequency. Whereas it is easy to confuse $p(\mathrm{positive\ test}\ |\ \mathrm{cancer})$ with $p(\mathrm{cancer}\ |\ \mathrm{positive\ test})$ when the task is expressed in terms of probabilities, it is clearly unreasonable to assume that about 800 of 1000 women (i.e., 80\%) actually have cancer (since the prevalence in the population was specified to be 10 in 1000, i.e., 1\%). Thus, reframing the problem in terms of frequencies made us immune to a typical mistake.
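The frequency-based reasoning above amounts to nothing more than integer arithmetic, which can be checked directly (a minimal base-R sketch, independent of riskyr):

```r
# Natural frequencies in a sample of N = 1000 women:
hi <- 8    # women with cancer AND a positive test (8 of the 10 with cancer)
fa <- 95   # women without cancer AND a positive test (95 of the 990 without)

# Of all women with a positive test, how many actually have cancer?
hi / (hi + fa)   # => 8/103, approx. 0.0777
```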

Using riskyr

Reframing the probabilistic problem in terms of frequencies made its solution easier. This is neat and probably one of the best tricks in risk literacy education (as advocated by Gigerenzer & Hoffrage, 1995; Gigerenzer 2002; 2014). While it is good to have a way to cope with tricky problems, it would be even more desirable to actually understand the interplay between probabilities and frequencies in risk-related tasks and domains. This is where riskyr comes into play.^[Full disclosure: As former students of Gerd Gigerenzer, we think that his recommendations are insightful, convincing, and correct. However, while expressing probabilities in terms of natural frequencies promotes a better understanding of risks, it does not automatically lead to a better understanding of conditional probabilities per se. riskyr extends beyond mere translations between representational formats by showing the interplay between frequencies and probabilities in a variety of ways.]

riskyr

riskyr provides a set of basic risk literacy tools in R. As we have seen, the problems humans face when dealing with risk-related information are less of a computational, and more of a representational nature. As a statistical programming language, R is a pretty powerful computational tool, but for our present purposes it is more important that R is also great for designing and displaying aesthetic and informative visualizations. By applying these qualities to the task of training and instruction in risk literacy, riskyr is a toolbox that renders risk literacy education more transparent.

riskyr promotes a deeper understanding of risk-related information in three ways:^[The riskyr logo (showing three facets of a dice) also represents its functionality in a variety of ways:
1. First, each facet provides a frequency (e.g., the number 3), though the dice is a paradigmatic example of a device that generates probabilities. (See Strevens, 2013, for inferring probabilisitc properties from physical devices.)
2. More importantly, each facet is informative by itself --- and often all that is of interest. However, to really understand the mechanism of the risk-generating device, it is crucial to view it from multiple angles. riskyr provides alternative perspectives that --- when viewed together --- render issues of risk literacy more transparent.
3. The three facets can be counted as the three steps of (a) organizing information, (b) translating between representational formats, and (c) visualising relationships between variables. ]

In the following, we show how we could address the above problem by using three types of tools provided by riskyr.

A. A fancy calculator

riskyr provides a set of functions that allows us to calculate various desired outputs (probabilities and frequencies) from given inputs (probabilities and frequencies). For instance, the following function computes the positive predictive value PPV from the 3 basic probabilities prev, sens, and spec (with spec = 1 - fart) that were provided in the original problem:

library("riskyr")  # loads the package

comp_PPV(prev = .01, sens = .80, spec = (1 - .096))

It's good to know that riskyr can apply Bayes' formula, but so can any other basic calculator --- including my brain on a good day, given some environmental support in the form of paper and pencil. The R in riskyr only begins to make sense when considering functions like the following:

# Compute probabilities from 3 essential probabilities:                 # Input arguments:
p1 <- comp_prob_prob(prev = .01, sens = .80, spec =   NA, fart = .096)  # prev, sens, NA,   fart
p2 <- comp_prob_prob(prev = .01, sens = .80, spec = .904, fart =   NA)  # prev, sens, spec, NA 
p3 <- comp_prob_prob(prev = .01, sens = .80, spec = .904, fart = .096)  # prev, sens, spec, fart

# Check equality of outputs:
all.equal(p1, p2)
all.equal(p2, p3)

The function comp_prob_prob computes probabilities from probabilities (hence its name). The probabilities provided need to include a prevalence prev, a sensitivity sens, and either the specificity spec or the false alarm rate fart (with spec = 1 - fart). The code above shows 3 different ways in which 3 of these "essential" probabilities can be provided (and hence the objects p1, p2, and p3 are all equal to each other).

The probabilities computed from these "essential" probabilities include the PPV, which can be obtained by asking for p1$PPV (approximately 0.078). But the object computed by comp_prob_prob is actually a list of 10 probabilities and can be inspected by printing p1:

p1

The list of probabilities computed includes the 3 essential probabilities (prev, sens, and spec or fart) and the desired probability (p1$PPV, approximately 7.8\%), but also many other probabilities that may have been asked for instead. (See the vignette on data formats for details on these probabilities.)

Incidentally, as R does not care whether probabilities are entered as decimal numbers or fractions, we can check whether the 2nd version of our problem --- the version reframed in terms of frequencies --- yields the same solution:

# Compute probabilities from 3 ratios of frequencies (probabilities):       # Input arguments:
p4 <- comp_prob_prob(prev = 10/1000, sens = 8/10, spec = NA, fart = 95/990) # prev, sens, NA, fart

p4$PPV

This shows that the PPV computed in this version is only marginally different from our earlier result (p4$PPV is approximately 0.0777). More importantly, it is exactly the ratio $\frac{8}{103} \approx 7.77\%$.

B. Translating between formats

Another function of riskyr is to translate between representational formats. This translation comes in two varieties:

Computing frequencies from probabilities

# Compute frequencies from probabilities:
f1 <- comp_freq_prob(prev =     .01, sens =  .80, spec = NA, fart =   .096, N = 1000)
f2 <- comp_freq_prob(prev = 10/1000, sens = 8/10, spec = NA, fart = 95/990, N = 1000)

# Check equality of outputs:
all.equal(f1, f2)

By providing our original probabilities to the function comp_freq_prob we can compute a list of frequencies from probabilities (hence the name). To compute frequencies for the specific sample size of 1000 individuals, we need to provide N = 1000 as an additional argument. As before, it does not matter whether the probabilities are supplied as decimal numbers or as ratios (as long as they actually are probabilities, i.e., numbers from 0 to 1).
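To see why rounding matters here, note that each frequency is simply the product of N and the corresponding probabilities. Computing them by hand (in base R, without riskyr) shows that the probability version of the problem does not yield whole numbers:

```r
N <- 1000; prev <- .01; sens <- .80; fart <- .096

N * prev                 # women with cancer (cond.true):  10
N * prev * sens          # hits (hi):                       8
N * (1 - prev) * fart    # false alarms (fa):              95.04 (not a whole number!)
```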

As the ratio fart = 95/990 is not exactly equal to fart = .096 (but rather approximately .09596), the two versions of our problem actually differ slightly. Here, the results f1 and f2 are only identical because the function comp_freq_prob rounds to the nearest integers by default. To compute more precise frequencies (that are no longer rounded to integers), use the round = FALSE argument:

# Compute frequencies from probabilities (without rounding):
f3 <- comp_freq_prob(prev =     .01, sens =  .80, spec = NA, fart =   .096, N = 1000, round = FALSE)
f4 <- comp_freq_prob(prev = 10/1000, sens = 8/10, spec = NA, fart = 95/990, N = 1000, round = FALSE)

# Check equality of outputs:
all.equal(f3, f4)  # => shows slight differences in some frequencies:

As before, the function comp_freq_prob does not compute only one frequency, but a list of 11 frequencies. Their names and values can be inspected by printing f1:

f1

In this list, the sample of N = $1000$ women is split into subgroups in several different ways. For instance, the $10$ women with cancer appear as cond.true cases, whereas the $990$ without cancer are listed as cond.false cases. The $8$ women with cancer and a positive test result appear as hits (hi), and the $95$ women who receive a positive test result without having cancer are listed as false alarms (fa). (See the vignette on data formats for details on all frequencies.)

Computing probabilities from frequencies

A translator between two representational formats should work in both directions. Consequently, riskyr also allows us to compute probabilities from frequencies:

# Compute probabilities from frequencies:
p5 <- comp_prob_freq(hi = 8, mi = 2, fa = 95, cr = 895)  # => provide 4 essential frequencies

Fortunately, comp_prob_freq does not require all 11 frequencies that comp_freq_prob returned in the list f1. Instead, we only need to provide the 4 essential frequencies that were listed as hi, mi, fa, and cr in f1. The resulting probabilities (saved in p5) match our list of probabilities from above (saved in p4):

# Check equality of outputs:
all.equal(p5, p4)
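The underlying definitions are simple ratios. Assuming the standard 2x2 structure of the four essential frequencies, the essential probabilities can also be recovered by hand (a base-R sketch, no riskyr needed):

```r
hi <- 8; mi <- 2; fa <- 95; cr <- 895   # the 4 essential frequencies
N  <- hi + mi + fa + cr                 # population size: 1000

prev <- (hi + mi) / N                   # => 0.01  (10 of 1000 have cancer)
sens <- hi / (hi + mi)                  # => 0.80  (8 of the 10 test positive)
spec <- cr / (fa + cr)                  # => 895/990, approx. 0.904
```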

Switching back and forth

More generally, when we translate between formats twice --- first from probabilities to frequencies and then from the resulting frequencies to probabilities --- the original probabilities appear again:

# Pick 3 random probability inputs:
rand.p <- runif(n = 3, min = 0, max = 1)
rand.p

# Translation 1: Compute frequencies from probabilities:
freq <- comp_freq_prob(prev = rand.p[1], sens = rand.p[2], spec = rand.p[3], round = FALSE)  # without rounding!

# Translation 2: Compute probabilities from frequencies:
prob <- comp_prob_freq(hi = freq$hi, mi = freq$mi, fa = freq$fa, cr = freq$cr)

# Verify that results match original probabilities: 
all.equal(prob$prev, rand.p[1])
all.equal(prob$sens, rand.p[2])
all.equal(prob$spec, rand.p[3])

Similarly, going full circle from frequencies to probabilities and back returns the original frequencies:

# Pick 4 random frequencies:
rand.f <- round(runif(n = 4, min = 0, max = 10^3), 0)
rand.f  
# sum(rand.f)

# Translation 1: Compute probabilities from frequencies:
prob <- comp_prob_freq(hi = rand.f[1], mi = rand.f[2], fa = rand.f[3], cr = rand.f[4])
# prob

# Translation 2: Compute frequencies from probabilities (for the original population size N):
freq <- comp_freq_prob(prev = prob$prev, sens = prob$sens, spec = prob$spec, N = sum(rand.f), round = FALSE)  # without rounding!
# freq

# Verify that results match original frequencies: 
all.equal(freq$hi, rand.f[1])
all.equal(freq$mi, rand.f[2])
all.equal(freq$fa, rand.f[3])
all.equal(freq$cr, rand.f[4])

To obtain the same results when translating back and forth between probabilities and frequencies, it is important to switch off rounding when computing frequencies from probabilities with comp_freq_prob. Similarly, we need to scale the computed frequencies to the original population size N to recover the original frequencies.
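The round trip can also be sketched without riskyr by composing the two ratio definitions directly. As long as no rounding occurs, the original probabilities are recovered exactly (up to floating-point precision):

```r
# Probabilities -> frequencies -> probabilities (no rounding):
prev <- .3; sens <- .9; spec <- .8; N <- 100

hi <- N * prev * sens                # true positives:  27
mi <- N * prev * (1 - sens)          # misses:           3
fa <- N * (1 - prev) * (1 - spec)    # false alarms:    14
cr <- N * (1 - prev) * spec          # true negatives:  56

c((hi + mi) / N, hi / (hi + mi), cr / (fa + cr))  # => 0.3 0.9 0.8
```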

C. Visualizing relationships between formats and variables

Inspecting the lists of probabilities and frequencies shows that the two problem formulations cited above are only two possible instances out of an array of many alternative formulations. Essentially, the same scenario can be described in a variety of variables and formats. Gaining deeper insights into the interplay between these variables requires a solid understanding of the underlying concepts and their mathematical definitions. To facilitate the development of such an understanding, riskyr recruits the power of visual representations and shows the same scenario from a variety of angles and perspectives. It is mostly this graphical functionality that supports riskyr's claim to being a toolbox for rendering risk literacy more transparent. Thus, in addition to being a fancy calculator and a translator between formats, riskyr is, above all, a machine that turns risk-related information into pretty pictures.

riskyr provides many alternative visualizations that depict the same risk-related scenario in the form of different representations. As each type of graphic has its own properties and perspective --- strengths that emphasize or illuminate some particular aspect and weaknesses that hide or obscure others --- the different visualizations are somewhat redundant, yet complement and support each other.

Here are some examples that depict the scenario described above:

Icon array

A straightforward way of plotting an entire population of individuals is provided by an icon array that represents each individual as a symbol which is color-coded:

plot_icons(prev = .01, sens = .80, spec = NA, fart = .096, N = 1000, 
           icon.types = c(21, 21, 22, 22),
           title.lbl = "Mammography screening")

Tree diagram

Perhaps the most intuitive visualization of the relationships between probability and frequency information in our above scenario is provided by a tree diagram that shows the population and the frequency of subgroups as its nodes and the probabilities as its edges:

plot_tree(prev = .01, sens = .80, spec = NA, fart = .096, N = 1000, 
          title.lbl = "Mammography screening")

Importantly, the plot_tree function is called with the same 3 essential probabilities (here prev, sens, and fart) and 1 frequency (the number of individuals N of our sample or population). But in addition to computing risk-related information (e.g., the number of individuals in each of the 4 subgroups at the 2nd level of the tree), the tree diagram visualizes crucial dependencies and relationships between concepts and quantities. For instance, the diagram illustrates that the number of true positives (hi) depends on both the condition's prevalence (prev) and the decision's sensitivity (sens), or that the decision's specificity (spec) can be expressed and computed as the ratio of the number of true negatives (cr) divided by the number of unaffected individuals (cond.false cases).

For details and additional options of the plot_tree function, see the documentation of ?plot_tree.

Mosaic plot

An alternative way to split a group of individuals into subgroups depicts the population as a square and dissects it into various rectangles that represent parts of the population. In the following mosaic plot, the relative proportions of rectangle sizes represent the relative frequencies of the corresponding subgroups:

plot_mosaic(prev = .01, sens = .80, spec =   NA, fart = .096, N = 1000,
            title.lbl = "Mammography screening")

The vertical split dissects the population into two subgroups that correspond to the frequency of cond.true and cond.false cases in the tree diagram above. The prev value of 1\% yields a slim vertical rectangle on the left.

For details and additional options of the plot_mosaic function, see the documentation of ?plot_mosaic.

Alternative perspectives

Both the tree diagram and the mosaic plot shown above adopted a particular perspective by splitting the population into 2 subgroups by condition (via the default option by = "cd"). Rather than emphasizing the difference between cond.true and cond.false cases, an alternative perspective could ask: How many people are detected as positive vs. negative by the test? By using the option by = "dc", the tree diagram first splits the population into dec.pos and dec.neg cases:

plot_tree(prev = .01, sens = .80, spec =   NA, fart = .096, N = 1000, 
          by = "dc", 
          title.lbl = "Mammography screening",
          dec.pos.lbl = "positive test",
          dec.neg.lbl = "negative test")

Similarly, the population area of the mosaic plot can be split horizontally by using the option vsplit = FALSE:

plot_mosaic(prev = .01, sens = .80, spec =   NA, fart = .096, N = 1000,
            vsplit = FALSE, 
            title.lbl = "Mammography screening")

riskyr uses a consistent color scheme to represent the same subgroups across different graphs. If this color coding is not sufficient, plotting the tree diagram with the option area = "hr" further highlights the correspondence by representing the relative frequencies of subgroups by the proportions of rectangles:

plot_tree(prev = .01, sens = .80, spec =   NA, fart = .096, N = 1000, 
          by = "dc",
          area = "hr", 
          title.lbl = "Mammography screening",
          dec.pos.lbl = "positive test",
          dec.neg.lbl = "negative test")

Incidentally, as both an icon array and a mosaic plot depict probability by area size, both representations can be translated into each other. This is still visible when relaxing the positional constraint of icons in the icon array:

plot_icons(prev = .01, sens = .80, spec = NA, fart = .096, N = 1000, block.d = 0.01,
           type = "mosaic",
           icon.types = c(21, 21, 22, 22),
           title.lbl = "Mammography screening")

Can you spot cases of hits (true positives) and misses (false negatives)? (Hint: Their frequency is 8 and 2, respectively.)

Network plot

The following network diagram is a generalization of the tree diagram. It plots all 9 different frequencies (computed by comp_freq_prob or comp_freq_freq and contained in freq) as nodes of a single graph and depicts all 10 probabilities (computed by comp_prob_prob or comp_prob_freq and contained in prob) as edges between these nodes. Thus, the network diagram integrates both perspectives of the above tree diagrams:

plot_fnet(prev = .01, sens = .80, spec =   NA, fart = .096, N = 1000, 
          title.lbl = "Mammography screening")

In addition to showing the interplay between all key frequencies and probabilities, the network diagram notes accuracy metrics that are based on the confusion matrix, which is depicted by the 4 central nodes in the middle row (hi, mi, fa, and cr).

For details and additional options of the plot_fnet function, see the documentation of ?plot_fnet.

References

Contact

spds.uni.kn

We appreciate your feedback, comments, or questions.

All riskyr Vignettes

riskyr

| Nr. | Vignette | Content |
| ---: |:---------|:-----------|
| A. | User guide | Motivation and general instructions |
| B. | Data formats | Data formats: Frequencies and probabilities |
| C. | Confusion matrix | Confusion matrix and accuracy metrics |
| D. | Functional perspectives | Adopting functional perspectives |
| E. | Quick start primer | Quick start primer |




riskyr documentation built on Feb. 19, 2018, 5 p.m.