View source: R/expl_cond_dist_tbl.R
This function returns a table of the cumulative conditional distribution for banded variables. For each pair of bands, the value is defined as (number of entries in both the band of main_var and the band of cond_var) / (number of entries in the band of cond_var).
This answers the question "Given that the entry is in the band of cond_var, what is the probability that it is in this band of main_var?" It is useful for tasks such as assessing model refreshes, where it can answer questions like: "Given that the score on the old model is in the 500-750 band, what percentage of these scores end up in each band for the new model?" This could be done with expl_band_cond_shift(main_var=df$new_score, cond_var=df$old_score). You would then look at the column for your 500-750 band (normally labelled just "750").
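The definition above can be reproduced with base R as a rough sketch. This is not the package's implementation; the data frame df and its banded columns old_score and new_score are made up for illustration, and the band labels ("500", "750", "1000") are assumed.

```r
# Hypothetical banded scores (assumed labels, e.g. "750" for the 500-750 band).
df <- data.frame(
  old_score = c("500", "750", "750", "750", "1000", "750"),
  new_score = c("500", "750", "1000", "750", "1000", "500"),
  stringsAsFactors = FALSE
)

# Rows are bands of the main variable, columns are bands of the conditioning variable.
counts <- table(main_var = df$new_score, cond_var = df$old_score)

# Divide each cell by its column total: share of each main band given the cond band,
# expressed as a percentage.
cond_pct <- prop.table(counts, margin = 2) * 100

# Distribution of new-score bands among entries whose old score is in the "750" band.
cond_pct[, "750"]
```

Each column of cond_pct sums to 100, because every column is normalised by the number of entries in its own cond_var band.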
expl_cond_dist_tbl(
  main_var,
  cond_var,
  output_var = "thin",
  NA_val = "_NA",
  warn_high_band = 50L,
  err_high_band = 100L,
  verbose = TRUE
)
main_var
    Array[Character]: A banded version of the variable whose distribution you want to assess conditional on another variable. For a model refresh, this would be the new score.
cond_var
    Array[Character]: A banded version of the variable being conditioned on. For a model refresh, this would be the old score.
output_var
    Character (Default: "thin"): The form of output to return. Options are "thin", "prop" and "count". See the return value description below.
NA_val
    Character/Numeric/NA (Default: "_NA"): NA replacement value.
warn_high_band
    Numeric (Default: 50): The number of bands above which a warning is generated. To disable the warning, set this above err_high_band.
err_high_band
    Numeric (Default: 100): The number of bands above which an error is raised.
verbose
    Logical (Default: TRUE): Whether to print a wide version of the table.
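The interaction between NA_val, warn_high_band and err_high_band might look roughly like the sketch below. The helper check_band_count is hypothetical and not part of the package. Whether the real function compares with > or >=, and whether it counts bands per variable or across both, is not stated in this documentation, so those choices are assumptions.

```r
# Hypothetical helper (not from the package): guard against too many distinct bands.
check_band_count <- function(x, warn_high_band = 50L, err_high_band = 100L) {
  n_bands <- length(unique(x))
  if (n_bands > err_high_band) {
    # Assumed behaviour: stop once the error threshold is exceeded.
    stop("Too many bands: ", n_bands, " (limit ", err_high_band, ")")
  }
  if (n_bands > warn_high_band) {
    # Assumed behaviour: warn, but carry on, above the warning threshold.
    warning("High number of bands: ", n_bands)
  }
  invisible(n_bands)
}

# Replace missing values with a placeholder band first, as NA_val suggests.
x <- c("500", "750", NA, "750")
x[is.na(x)] <- "_NA"
n <- check_band_count(x)
```

Setting warn_high_band above err_high_band in this sketch means the warning branch can never be reached, which matches the advice given for disabling the warning.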
DataFrame: The format of the returned dataframe depends on output_var.
If output_var = "thin", the output is a table in which each row is a unique combination of main_var and cond_var.
If output_var = "count", the output is the wide table giving the number of entries in each row/column combination.
If output_var = "prop", the output is the wide table giving, for each row/column combination, the percentage of that column's entries that fall in that row.
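To make the three shapes concrete, here is a base R approximation. It only illustrates the layouts described above; the actual column names and ordering returned by expl_cond_dist_tbl may differ, and the object names count_tbl, prop_tbl and thin_tbl are made up here.

```r
main_var <- c("1", "1", "1", "2", "1")
cond_var <- c("1", "1", "1", "1", "2")

# "count"-like: wide table, rows = main_var bands, columns = cond_var bands.
count_tbl <- table(main_var = main_var, cond_var = cond_var)

# "prop"-like: each cell as a percentage of its column total.
prop_tbl <- prop.table(count_tbl, margin = 2) * 100

# "thin"-like: one row per (main_var, cond_var) combination.
thin_tbl <- as.data.frame(count_tbl, responseName = "count")
```

Here thin_tbl has four rows (two main_var bands times two cond_var bands), while count_tbl and prop_tbl are both 2 x 2.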
Functions required: prep_char_num_sort
expl_cond_dist_tbl(main_var = c(1,1,1,2,1), cond_var = c(1,1,1,1,2), output_var="prop")
# Output: Table with Col 1: (3/4 x 100 = 75, 1/4 x 100 = 25); Col 2: (1/1 x 100 = 100, 0/1 x 100 = 0)
# i.e. there are 4 positions with cond_var = 1. 3/4 of these have 1 and 1/4 of these have 2 in main_var.
# There is only 1 position with cond_var = 2. In main_var this is a 1.
# -> 1 out of 1 for col "2" in row main_var = "1", 0 out of 1 for col "2" in row main_var = "2".