View source: R/getenumCI2023.R
| getenumCI2023 | R Documentation | 
This is the primary analysis function for veris. It calculates the point estimate and credible intervals for enumerations (for example, 'Malware', 'Hacking', etc. within 'action').
The 'by' parameter allows enumerating one feature by another (for example, to count the frequency of each action by year).
getenumCI2023(
  veris,
  enum,
  by = NULL,
  na.rm = NULL,
  unk = FALSE,
  short.names = TRUE,
  ci.method = c(),
  cred.mass = 0.95,
  ci.level = NULL,
  ci.params = FALSE,
  round.freq = 5,
  na = NULL,
  top = NULL,
  force = FALSE,
  quietly = FALSE,
  ...
)
| veris | A verisr object or arrow dataset of a verisr object | 
| enum | A veris feature or enumeration to summarize | 
| by | A veris feature or enumeration to group by | 
| na.rm | A boolean indicating whether to include not applicable (NA) values in the sample set. This is REQUIRED if enum has a potential value of NA, as there is no 'default' method for handling NAs. Instead, it depends on the hypothesis being tested. | 
| unk | A boolean indicating whether to include 'unknown' in the sample. The default is 'FALSE' and should rarely be overridden. | 
| short.names | A boolean identifying whether to use the full enumeration name or just the last section (e.g. action.hacking.variety.SQLi vs. just SQLi). | 
| ci.method | A confidence interval method to use. Options are "mcmc" or "bootstrap". "bootstrap" uses the bayes process from the binom package. "mcmc" uses a binomial model based on rstan, rstanarm, brms. | 
| cred.mass | The amount of probability mass that will be contained in reported credible intervals. This argument fills a similar role as conf.level in  | 
| ci.level | DEPRECATED! Same as  | 
| ci.params | Set to TRUE to receive a list column in the output of or used to recreate the model used to determine the CI. | 
| round.freq | An integer indicating how many places to round the frequency value to. (default = 5) | 
| na | DEPRECATED! Use ' | 
| top | Integer limiting the output to top enumerations. | 
| force | getenumCI() will attempt to enforce sane confidence-based practices (such as hiding x and freq in low sample sizes). Setting force to 'TRUE' will override these best practices. | 
| quietly | When TRUE, suppress all warnings and messages. This is helpful when getenumCI is used in a larger script or markdown document. | 
| ... | A catch all for functions using arguments from previous versions of getenum. | 
Unknowns are generally excluded as 'not tested'. If 'NA' is an enumeration in the feature being enumerated, 'na.rm' must be specified, because whether NA should be included depends heavily on the hypothesis being tested.
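To make the effect on the sample concrete, here is a minimal base R sketch. It does not call getenumCI(); the helper `enum_freq()` and the toy values are hypothetical, and it assumes that 'unk = FALSE' drops 'Unknown' rows and that 'na.rm = TRUE' drops NA rows before the frequency is computed.

```r
# Toy data: one value per incident. Hypothetical values, not drawn from VCDB.
variety <- c("Phishing", "Phishing", "Pretexting", "Unknown", "Unknown", NA)

# Hypothetical helper (not part of verisr): frequency of each value after
# optionally dropping 'Unknown' and NA rows from the sample.
enum_freq <- function(x, unk = FALSE, na.rm = TRUE) {
  if (!unk) x <- x[is.na(x) | x != "Unknown"]  # drop Unknown, keep NA for now
  if (na.rm) x <- x[!is.na(x)]                 # drop NA rows from the sample
  x[is.na(x)] <- "NA"                          # otherwise count NA as a level
  counts <- table(x)
  data.frame(enum = names(counts),
             x    = as.integer(counts),
             n    = length(x),
             freq = as.numeric(counts) / length(x))
}

res_default <- enum_freq(variety)                 # Unknown and NA dropped
res_with_na <- enum_freq(variety, na.rm = FALSE)  # NA kept as its own level
```

In this sketch the same count of 'Phishing' yields a different frequency depending on whether NA rows remain in the denominator, which is why the choice has to follow from the hypothesis.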
This function accurately enumerates single logical columns, character feature columns, and features spanning multiple logical columns (such as action.*). It cannot enumerate free-form text columns. It accurately calculates the sample size 'n' as the number of rows (independent of the number of enumerations present in the feature).
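The distinction between rows and enumerations can be illustrated with a small base R sketch. The data frame and its column names are illustrative only, and the rule used to decide which rows are in the sample (at least one enumeration present) is an assumption, not necessarily what getenumCI() does internally.

```r
# Toy stand-in for a verisr-style data frame: one row per incident,
# one logical column per enumeration (column names are illustrative).
incidents <- data.frame(
  action.Malware = c(TRUE,  TRUE,  FALSE, FALSE),
  action.Hacking = c(TRUE,  FALSE, TRUE,  FALSE),
  action.Social  = c(FALSE, FALSE, TRUE,  FALSE)
)

cols <- grep("^action\\.", names(incidents), value = TRUE)

# Assumption: a row is in the sample if at least one enumeration is present.
in_sample <- rowSums(incidents[, cols]) > 0

x    <- colSums(incidents[in_sample, cols])  # count per enumeration
n    <- sum(in_sample)                       # sample size = rows, not sum(x)
freq <- round(x / n, 5)                      # 5 places, like round.freq = 5
```

Because one incident can carry several enumerations, `sum(x)` exceeds `n` here and the frequencies need not sum to 1.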
getenumCI() can also provide binomial confidence intervals for the enumerations tested within the features. See the parameters for details.
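As a rough illustration of what an interval for a single enumeration looks like, the sketch below computes an equal-tailed Bayesian interval for a proportion in base R. The counts are hypothetical, and the Jeffreys Beta(0.5, 0.5) prior and the equal-tailed form are assumptions made for this example; the intervals reported by getenumCI() may be computed differently.

```r
# Hypothetical counts: 30 of 120 incidents show the enumeration.
x <- 30
n <- 120
cred.mass <- 0.95

# Assumed Jeffreys prior Beta(0.5, 0.5); the posterior for a binomial
# proportion is then Beta(x + 0.5, n - x + 0.5).
a <- x + 0.5
b <- n - x + 0.5

# Equal-tailed interval containing cred.mass of the posterior probability.
alpha     <- (1 - cred.mass) / 2
lower     <- qbeta(alpha, a, b)
upper     <- qbeta(1 - alpha, a, b)
post_mean <- a / (a + b)
```

Raising `cred.mass` widens the interval between `lower` and `upper`; lowering it narrows the interval.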
While getenumCI() may work on other types of dataframes, it was designed for verisr dataframes and data.tables. It is not tested nor recommended for any other type.
A data frame summarizing the enumeration
# Download the VCDB data to a temporary file and load it
# (the examples below use the loaded 'vcdb' object).
tmp <- tempfile(fileext = ".dat")
download.file("https://github.com/vz-risk/VCDB/raw/master/data/verisr/vcdb.dat", tmp, quiet=TRUE)
load(tmp, verbose=TRUE)
library(magrittr)  # provides the %>% pipe used below
# Enumerate a single feature
chunk <- getenumCI(vcdb, "action.hacking.variety")
chunk
# Limit the output to the top 10 enumerations
chunk <- getenumCI(vcdb, "action.hacking.variety", top=10)
# Enumerate one feature by another (hacking variety by incident year)
chunk <- getenumCI(vcdb, "action.hacking.variety", by="timeline.incident.year")
chunk
# Pivot the by-year result to one column per enumeration
chunk <- getenumCI(vcdb, 
                   "action.hacking.variety", 
                   by="timeline.incident.year") 
chunk %>% 
    dplyr::select(by, enum, freq) %>% 
    tidyr::pivot_wider(names_from=enum, values_from=freq, values_fill = list(freq=0))
# Other features and options
getenumCI(vcdb, "action")
getenumCI(vcdb, "asset.variety")
getenumCI(vcdb, "asset.assets.variety")
getenumCI(vcdb, "asset.assets.variety", ci.method="wilson")
getenumCI(vcdb, "asset.cloud", na.rm=FALSE)
getenumCI(vcdb, "action.social.variety.Phishing")
getenumCI(vcdb, "actor.*.motive", ci.method="wilson", na.rm=FALSE)
# Clean up
rm(vcdb)