padding_test | R Documentation |
Performs the padding test, comparing the observed data against simulations of Benford-conforming datasets via percentile
padding_test(
digitdata,
data_columns = "all",
max_length = 8,
num_digits = 5,
N = 10000,
simulate = TRUE,
omit_05 = NA,
break_out = NA,
break_out_grouping = NA,
category = NA,
category_grouping = NA,
distribution = "Benford",
contingency_table = NA,
suppress_first_division_plots = NA,
plot = TRUE
)
digitdata |
An object of class |
data_columns |
The names of the numeric data columns to be analyzed. Defaults to 'all', which uses all data columns in |
max_length |
The length of the longest numbers considered. Defaults to 8. |
num_digits |
The total number of digits, aligned from the right, to be analyzed. Defaults to 5, meaning the 1s through 10,000s digit places are analyzed. |
N |
The number of Benford conforming datasets to simulate.
|
simulate |
TRUE or FALSE: If TRUE, will simulate the datasets and generate a p-value. If FALSE, only produces |
omit_05 |
Which digits to omit: 0 only, or both 0 and 5. To omit both 0 and 5, pass in c(0,5) or c(5,0); to omit only 0, pass in 0 or c(0); to omit neither, pass in NA. Defaults to NA. |
break_out |
|
break_out_grouping |
A list of arrays; defaults to NA. Only effective if
|
category |
The column for splitting the data into sectors for separate analysis. The second division (usually variables) shown in plots. |
category_grouping |
A list of arrays; defaults to NA. Only effective if
|
distribution |
'Benford' or 'Uniform' (case insensitive). Specifies the distribution the chi-square test is testing against. Defaults to 'Benford'. |
contingency_table |
The user-input probability table of arbitrary distribution. Overwrites
|
suppress_first_division_plots |
TRUE or FALSE: If TRUE, suppress the display of all plots on first and second division.
If TRUE, |
plot |
TRUE or FALSE or 'Save': If TRUE, display the plots and return them. If 'Save', return the plots but suppress display. If FALSE, no plot is produced. Defaults to TRUE. |
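The interaction of max_length, num_digits and omit_05 can be illustrated with a small sketch. This is not the package's implementation: right_aligned_digits is a hypothetical helper, and the choices to mark missing digit places and omitted digits as NA and to drop numbers longer than max_length are assumptions made only for illustration.

```r
# Hypothetical helper (not part of the package): split numbers into
# right-aligned digits. Column d1 is the 1s place, d2 the 10s place, etc.
right_aligned_digits <- function(x, num_digits = 5, max_length = 8,
                                 omit_05 = NA) {
  x <- floor(abs(x))
  x <- x[x >= 1]
  # Assumption: numbers longer than max_length are dropped entirely.
  x <- x[nchar(sprintf("%.0f", x)) <= max_length]

  places <- 10^(seq_len(num_digits) - 1)
  digits <- vapply(
    places,
    # A digit place that the number does not reach is recorded as NA.
    function(p) ifelse(x >= p, (x %/% p) %% 10, NA_real_),
    numeric(length(x))
  )
  digits <- matrix(digits, nrow = length(x), ncol = num_digits,
                   dimnames = list(NULL, paste0("d", seq_len(num_digits))))

  # Assumption: omitted digits (0, or 0 and 5) are simply masked out.
  if (!all(is.na(omit_05))) {
    digits[digits %in% omit_05] <- NA
  }
  digits
}

# Three digit places from the right, ignoring zeros.
right_aligned_digits(c(1234, 42, 100005), num_digits = 3, omit_05 = 0)
```

Each row of the result corresponds to one retained number, so column means give the per-place digit means that a padding-style comparison would work with.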
A list with 4 elements:
A list of p-values from the Monte Carlo simulation for each category
A list of differences in mean between observed_mean and expected_mean for each category
A sample size value that corresponds to N if simulate = TRUE
Plots for each category if plot = TRUE or 'Save'
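How a percentile-based Monte Carlo p-value and a difference in means fit together can be sketched as follows. This is a minimal illustration under stated assumptions, not the function's actual algorithm: mc_percentile_p is a hypothetical helper, probs stands in for whatever expected digit distribution is used, and the one-sided tail (the share of simulated means at least as large as the observed mean) is an assumption.

```r
# Hypothetical helper (not part of the package): Monte Carlo p-value for the
# mean of a set of digits, given expected probabilities for digits 0 to 9.
mc_percentile_p <- function(observed, probs, N = 10000) {
  stopifnot(length(probs) == 10, abs(sum(probs) - 1) < 1e-8)
  n <- length(observed)
  expected_mean <- sum(0:9 * probs)
  observed_mean <- mean(observed)

  # Simulate N conforming datasets of the same size and record their means.
  sim_means <- replicate(
    N, mean(sample(0:9, n, replace = TRUE, prob = probs))
  )

  list(
    # Assumed one-sided tail: fraction of simulated means >= observed mean.
    p_value = mean(sim_means >= observed_mean),
    diff_in_mean = observed_mean - expected_mean
  )
}

# Toy data skewed toward high digits, compared against a uniform expectation.
set.seed(1)
padded <- sample(0:9, 500, replace = TRUE,
                 prob = c(rep(0.08, 5), rep(0.12, 5)))
mc_percentile_p(padded, probs = rep(0.1, 10), N = 1000)
```

A larger N gives a finer-grained p-value at the cost of run time, which is why a small N such as 100 can be convenient for a quick check.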
padding_test(digitdata, omit_05=c(0,5), simulate=FALSE)
padding_test(digitdata, data_columns=c('col_name1', 'col_name2'), break_out='col_name')
padding_test(digitdata, N=100, break_out='col_name', distribution='uniform', plot='Save')
padding_test(digitdata, max_length=10, num_digits=3, omit_05=0, break_out='col_name', category='category_name')