knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
A basic understanding of probability and statistics is crucial for data understanding and discovery of meaningful patterns. A great way to teach probability and statistics is to start with an experiment, like rolling a dice or flipping a coin.
This package simulates rolling a dice and flipping a coin. Each experiment generates a tibble. Dice rolls and coin flips are simulated using sample(). The properties of the dice can be changed, like the number of sides. A coin flip is simulated using a two sided dice. Experiments can be combined with the pipe-operator.
tidydice package on Github: https://github.com/rolkra/tidydice
As the tidydice-functions fits well into the tidyverse, we load the dplyr-package. For quick visualisations we use the explore-package. To create more flexible graphics, use ggplot2.
library(tidydice) library(dplyr) library(explore)
6 dice are rolled 3 times using roll_dice(). The result of the dice-experiment is visualised using plot_dice().
set.seed(123) roll_dice(times = 6, rounds = 3) %>% plot_dice()
The output of roll_dice() is a tibble. Each row represents a dice roll. Without parameters, a dice is rolled once. You can use plot_dice() to visualise the result.
```{R echo=TRUE} set.seed(123) roll_dice()
```{R echo=TRUE, fig.width=6, fig.height=1} set.seed(123) roll_dice() %>% plot_dice()
Success is defined as result = 6 (as default), while result = 1..5 is not a success. In this case the result is 2, so it is no success.
If we would define result = 2 and result = 6 as success, it would be treated as success.
```{R echo=TRUE} set.seed(123) roll_dice(success = c(2,6))
#### Unfair dice As default, the dice is fair. So every result (0..6) has the same probability. If you want, you can change this. ```{R echo=TRUE} set.seed(123) roll_dice(prob = c(0,0,0,0,0,1))
In this case we created a dice that always gets result = 6 (with 100% probability)
As default the dice has 6 sides. If you want you can change this. Here we use a dice with 12 sides. result now can have a value between 1 and 12. But result = 6 is still the default success.
```{R echo=TRUE} set.seed(123) roll_dice(sides = 12)
#### Roll a dice 4 times ```{R echo=TRUE} set.seed(123) roll_dice(times = 4)
```{R echo=TRUE, fig.width=6, fig.height=1} set.seed(123) roll_dice(times = 4) %>% plot_dice()
We get 1 success #### Define rounds ```{R echo=TRUE} set.seed(123) roll_dice(times = 4, rounds = 2)
```{R echo=TRUE, fig.width=6, fig.height=2} set.seed(123) roll_dice(times = 4, rounds = 2) %>% plot_dice()
Rolling the dice 4 times is repeated. In the first round we got 1 success, in the secound round 2 success. #### Use agg A convenient way to aggregate the result, is to use the agg parameter. Now we get one line per round. ```{R echo=TRUE} set.seed(123) roll_dice(times = 4, rounds = 2, agg = TRUE)
You can aggregate by hand too using dplyr.
```{R echo=TRUE} set.seed(123) roll_dice(times = 4, rounds = 2) %>% group_by(experiment, round) %>% summarise(times = n(), success = sum(success))
#### Visualise result You can use any package/tool you like to visualise the result. In this example we use the explore-package. ```{R echo=TRUE, fig.height=4, fig.width=7} set.seed(123) roll_dice(times = 100) %>% explore(result, title = "Rolling a dice 100x")
In 15% of the cases we got a six. This is close to the expected value of 100/6 = 16.67%
If we increase the times parameter to 10000, the results are more balanced.
```{R echo=TRUE, fig.height=4, fig.width=7} set.seed(123) roll_dice(times = 10000) %>% explore(result, title = "Rolling a dice 10000x")
#### Visualise success If we repeat the experiment rolling a dice 100x with rounds = 100, we get the distribution with a peak at about 17 (16.67 is the expected value) ```{R echo=TRUE, fig.height=4, fig.width=7} set.seed(123) roll_dice(times = 100, rounds = 100, agg = TRUE) %>% explore(success, title = "Rolling 100 dice 100x", auto_scale = FALSE)
If we increase rounds from 100 to 10000 we get a more symmetric shape. We see that success below 5 and success above 30 are very unlikely.
```{R echo=TRUE, fig.height=4, fig.width=7} set.seed(123) roll_dice(times = 100, rounds = 10000, agg = TRUE) %>% explore(success, title = "Rolling 100 dice 10000x", auto_scale = FALSE)
This shape is already very close to the binomial distribution ```{R echo=TRUE, fig.height=4, fig.width=7} binom_dice(times = 100) %>% plot_binom(title = "Binomial distribution, rolling 100 dice")
```{R echo=TRUE} set.seed(123) roll_dice(times = 100, rounds = 10000, agg = TRUE) %>% mutate(check = ifelse(success < 5 | success > 30, 1, 0)) %>% count(check)
In only 4 of 10000 (0.04%) cases success is below 5 or above 30. So the probability to get this result is very low. We can check that with the binomial distribution too: ```{R echo=TRUE} binom_dice(times = 100) %>% filter(success < 5 | success > 30)
```{R echo=TRUE} binom_dice(times = 100) %>% filter(success < 5 | success > 30) %>% summarise(check_pct = sum(pct))
The probability to get this result is 0.04% (based on the binomial distribution). #### Combine experiments Let's add an experiment, where you have 10 extra dice. The shape of the distribution changes. ```{R echo=TRUE, fig.height=4, fig.width=7} set.seed(123) roll_dice(times = 100, rounds = 10000, agg = TRUE) %>% roll_dice(times = 110, rounds = 10000, agg = TRUE) %>% explore(success, target = experiment, title = "Rolling a dice 100/110x", auto_scale = FALSE)
You can add as many experiments as you like (as long they generate the same data structure)
Adding an experiment with times = 150 will generate a smaller but wider shape.
```{R echo=TRUE, fig.height=4, fig.width=7} set.seed(123) roll_dice(times = 100, rounds = 10000, agg = TRUE) %>% roll_dice(times = 110, rounds = 10000, agg = TRUE) %>% roll_dice(times = 150, rounds = 10000, agg = TRUE) %>% explore(success, target = experiment, title = "Rolling a dice 100/110/150x", auto_scale = FALSE)
#### Binomial distribution Rolling a dice 100x, a result between 10 and 23 has a probability of over 94% ```{R echo=TRUE, fig.height=4, fig.width=7} binom_dice(times = 100) %>% plot_binom(highlight = c(10:23))
Internally the package handles coins as dice with only two sides. Success is defined as result = 2 (as default), while result = 1 is not a success.
```{R echo=TRUE} set.seed(123) flip_coin(times = 10)
In this case the result are 6x 2 and 4x 1. We can use the describe() function of the explore-package to get a good overview. ```{R echo=TRUE} set.seed(123) flip_coin(times = 10) %>% describe(success)
Or just use the agg-parameter
```{R echo=TRUE} set.seed(123) flip_coin(times = 10, agg = TRUE)
#### Define rounds The parameter rounds can be used like in roll_dice(). ```{R echo=TRUE} set.seed(123) flip_coin(times = 10, rounds = 4, agg = TRUE)
```{R echo=TRUE, fig.height=4, fig.width=7} set.seed(123) flip_coin(times = 10, rounds = 4) %>% plot_coin()
#### Combine experiments ```{R echo=TRUE} set.seed(123) flip_coin(times = 10, agg = TRUE) %>% flip_coin(times = 15, agg = TRUE)
```{R echo=TRUE} binom_coin(times = 10)
```{R echo=TRUE, fig.height=4, fig.width=7} binom_coin(times = 10) %>% plot_binom(title = "Binomial distribution,\n10 coin flips")
set.seed(123) roll_dice(times = 6) %>% plot_dice()
set.seed(123) roll_dice(times = 6) %>% plot_dice(fill = "black", line_color = "white", point_color = "white")
set.seed(123) roll_dice(times = 6) %>% plot_dice(fill = "lightblue", fill_success = "gold")
set.seed(123) roll_dice(times = 6) %>% plot_dice(fill = "darkgrey", fill_success = "darkblue", line_color = "white", point_color = "white")
set.seed(123) roll_dice(times = 6) %>% plot_dice(detailed = TRUE)
set.seed(123) roll_dice(times = 6) %>% plot_dice(detailed = FALSE)
plot_dice() is limited to 1 experiment with max. 10 times x 10 rounds.
set.seed(123) roll_dice(times = 10, rounds = 10) %>% plot_dice(detailed = FALSE, fill_success = "gold")
You can force a result using force_dice() and force_coin().
force_dice(1:6) %>% plot_dice()
force_dice(rep(6, times = 6)) %>% plot_dice()
We can combine two foreced dice rolling using the pipe operator and the parameter round.
force_dice(rep(5, times = 3), round = 1) %>% force_dice(rep(6, times = 3), round = 2)
set.seed(123) force_dice(rep(6, times = 3)) %>% roll_dice(times = 3)
In the first experiment we get 3 times a 6 (forced), but in the second experiment none.
If you want to do more complex dice rolls, use roll_dice_formula()
# roll 1 dice with 6 sides roll_dice_formula(dice_formula = "1d6", seed = 123)
roll_dice_formula( dice_formula = "4d6", # 4 dice with 6 sides success = 15:24, # success is defined as sum between 15 and 24 seed = 123 # random seed to make it reproducible )
# roll 4 dice with 6 sides roll_dice_formula( dice_formula = "4d6", # 4 dice with 6 sides rounds = 10, # repeat 10 times success = 15:24, # success is defined as sum between 15 and 24 seed = 123 # random seed to make it reproducible )
roll_dice_formula( dice_formula = "4d6e3", # 4 dice with 6 sides, explode on a 3 rounds = 5, # repeat 5 times success = 15:24, # success is defined as sum between 15 and 24 seed = 123 # random seed to make it reproducible )
roll_dice_formula( dice_formula = "4d6+1d10", # 4 dice with 6 sides + 1 dice with 10 sides rounds = 1000) %>% # repeat 1000 times explore_bar(result, numeric = TRUE) # visualise result
Other examples for dice_formula:
1d6
= roll one 6-sided dice1d8
= roll one 8-sided dice1d12
= roll one 12-sided dice2d6
= roll two 6-sided dice1d6e6
= roll one 6-sided dice, explode dice on a 63d6kh2
= roll three 6-sided dice, keep highest 2 rolls3d6kl2
= roll three 6-sided dice, keep lowest 2 rolls4d6kh3e6
= roll four 6-sided dice, keep highest 3 rolls, but explode on a 61d20+4
= roll one 20-sided dice, and add 41d4+1d6
= roll one 4-sided dice and one 6-sided dice, and sum the resultsAny scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.