test_veris_hypothesis: Determine a hypothesis about Ea and Eb

View source: R/ci.R

test_veris_hypothesis    R Documentation

Determine a hypothesis about Ea and Eb

Description

Use to test statements like "Hacking actions were more common than Malware". For example, you might run:

vcdb %>%
  dplyr::filter(plus.dbir_year == 2018) %>%
  dplyr::filter(attribute.confidentiality.data_disclosure.Yes) %>%
  verisr::getenumCI("action")

and get:

  enum            x   n    freq
1 Error         130 448 0.29018
2 Misuse        123 448 0.27455
3 Hacking        87 448 0.19420
4 Physical       82 448 0.18304
5 Social         42 448 0.09375
6 Malware        25 448 0.05580
7 Environmental   0 448 0.00000
8 Unknown         6  NA      NA
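To make the shape of that output concrete, the following base R sketch rebuilds the table by hand and checks that the freq column is simply x / n. The data frame name chunk_like is hypothetical, and a real getenumCI() object may well carry additional columns; this is only a stand-in built from the numbers quoted above.

```r
# Hand-built stand-in for the table above (not a real getenumCI() object).
chunk_like <- data.frame(
  enum = c("Error", "Misuse", "Hacking", "Physical",
           "Social", "Malware", "Environmental", "Unknown"),
  x    = c(130, 123, 87, 82, 42, 25, 0, 6),
  n    = c(rep(448, 7), NA),   # "Unknown" has no sample size in the output
  stringsAsFactors = FALSE
)

# freq is the share of records carrying each enumeration; NA where n is NA.
chunk_like$freq <- round(chunk_like$x / chunk_like$n, 5)

print(chunk_like)
```

Note that Error (0.29018) and Misuse (0.27455) differ by only 7 records out of 448, which is why a formal test is needed before claiming one is more common than the other.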

Usage

test_veris_hypothesis(
  chunk,
  Ea,
  Eb,
  direction,
  ci.level = 0.05,
  reps = 1000,
  quietly = FALSE,
  visualize = FALSE
)

Arguments

chunk

getenumCI() object

Ea

Enumeration A. e.g. "action.Error"

Eb

Enumeration B. e.g. "action.Misuse"

direction

the direction to test ("greater" or "less")

ci.level

the confidence level to test against (default: 0.05)

reps

number of simulations to conduct

quietly

do not produce textual output

visualize

produce visual output

Details

You want to write in a report "Errors are more common in breaches than Misuse.", but how do you validate that? You run:

chunk %>% verisr::test_veris_hypothesis("action.Error", "action.Misuse", "greater")

which would return 'FALSE', as the two are simply too close to be significantly different.

Technically, instead of 'true/false', the language should really be along the lines of "we have evidence for the alternative hypothesis ..." or "we do not have evidence to go against our original null hypothesis ...", but for simplicity we have left it the way it is.

WARNING: This currently only works with 'logical' columns
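As an illustration of the idea in this section, here is a minimal base R sketch of a simulation-based comparison of two enumerations. It is not the verisr implementation: the function name sketch_compare, its arguments, and the assumption that the two counts can be treated as independent binomial draws are all hypothetical simplifications. Only the defaults of 0.05 and 1000 repetitions are borrowed from the Usage section.

```r
# Hypothetical helper, NOT part of verisr: checks by simulation whether
# enumeration A is more (or less) common than enumeration B.
sketch_compare <- function(xa, xb, n, direction = "greater",
                           level = 0.05, reps = 1000) {
  stopifnot(direction %in% c("greater", "less"))

  # Assumption: each count is an independent binomial draw out of n records.
  sim_a <- rbinom(reps, size = n, prob = xa / n)
  sim_b <- rbinom(reps, size = n, prob = xb / n)

  # Share of simulated samples that contradict the claimed direction.
  contradicting <- if (direction == "greater") {
    mean(sim_a <= sim_b)
  } else {
    mean(sim_a >= sim_b)
  }

  # TRUE only when contradictions are rarer than the chosen level.
  contradicting < level
}

set.seed(42)  # make the simulation reproducible
error_vs_misuse  <- sketch_compare(130, 123, 448, "greater")
error_vs_malware <- sketch_compare(130, 25, 448, "greater")
print(c(error_vs_misuse = error_vs_misuse, error_vs_malware = error_vs_malware))
```

With the counts from the Description, this sketch agrees with the text above: Error versus Misuse comes out FALSE because the two are too close, while Error versus Malware comes out TRUE.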

Value

a logical (TRUE/FALSE) answer to the hypothesis

Examples

## Not run: 
library(dplyr)  # provides %>% and filter() used below
tmp <- tempfile(fileext = ".dat")
download.file("https://github.com/vz-risk/VCDB/raw/master/data/verisr/vcdb.dat", tmp, quiet=TRUE)
load(tmp, verbose=TRUE)
# test "Errors are more common in breaches than Misuse."
vcdb %>%
   filter(attribute.confidentiality.data_disclosure.Yes) %>%
   verisr::getenumCI2020("action") %>%
   verisr::test_veris_hypothesis("Error", "Misuse", "greater")
# test "Partner actors are less common in breaches than external actors"
vcdb %>%
   filter(attribute.confidentiality.data_disclosure.Yes) %>%
   verisr::getenumCI2020("actor") %>%
   verisr::test_veris_hypothesis("External", "Partner", "greater")

## End(Not run)

vz-risk/verisr documentation built on Aug. 5, 2023, 4:34 a.m.