View source: R/sdc_descriptives.R

sdc_descriptives | R Documentation |

Checks the number of distinct entities and the (n, k) dominance rule for your descriptive statistics.

That means that `sdc_descriptives()` checks if there are at least 5 distinct entities and if the largest 2 entities account for 85% or more of `val_var`. The parameters can be changed using options. For details see `vignette("options", package = "sdcLog")`.
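As a rough illustration of changing the thresholds via options, the sketch below uses base R's `options()` and `getOption()`. Apart from `sdc.id_var`, which appears in the usage on this page, the option names `sdc.n_ids`, `sdc.n_ids_dominance` and `sdc.share_dominance` are assumptions and should be checked against the options vignette before use.

```r
# Assumed option names (only "sdc.id_var" is confirmed by this page);
# verify them in vignette("options", package = "sdcLog").
old <- options(
  sdc.n_ids = 3,              # assumed: minimum number of distinct entities
  sdc.n_ids_dominance = 1,    # assumed: n in the (n, k) dominance rule
  sdc.share_dominance = 0.75, # assumed: k expressed as a share (75%)
  sdc.id_var = "id"           # default id variable used by sdc_descriptives()
)

# Read a value back to confirm that it is set for this session.
current_n_ids <- getOption("sdc.n_ids")

# Restore the previous settings so later checks use the defaults again.
options(old)
```

Saving the return value of `options()` and restoring it afterwards keeps the stricter or looser thresholds from leaking into unrelated checks in the same session.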

    sdc_descriptives(
      data,
      id_var = getOption("sdc.id_var"),
      val_var = NULL,
      by = NULL,
      zero_as_NA = NULL,
      fill_id_var = FALSE
    )

`data` |
data.frame from which the descriptive statistics are calculated. |

`id_var` |
character The name of the id variable. Defaults to `getOption("sdc.id_var")`. |

`val_var` |
character vector of value variables on which descriptive statistics are computed. |

`by` |
character vector of grouping variables. |

`zero_as_NA` |
logical If TRUE, zeros in 'val_var' are treated as NA. |

`fill_id_var` |
logical Only for very specific use cases. For example:
- `id_var` contains `NA` values which represent missing values in the sense that values identifying the entity actually exist but are unknown (or were deleted for privacy reasons).
- `id_var` contains `NA` values which result from an observation featuring more than one confidential identifier, where not all of these identifiers are present in each observation. Examples of such identifiers are the role of a broker in a security transaction or the role of a collateral giver in a credit relationship.

Defaults to `FALSE`. |

The general form of the $(n, k)$ dominance rule can be formulated as:

$$\sum_{i=1}^{n} x_i > \frac{k}{100} \sum_{i=1}^{N} x_i$$

where $x_1 \ge x_2 \ge \cdots \ge x_N$. $n$ denotes the number of largest contributions to be considered, $x_n$ the $n$-th largest contribution, $k$ the maximal percentage these $n$ contributions may account for, and $N$ is the total number of observations.

If the statement above is true, the $(n, k)$ dominance rule is violated.
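The rule above can be checked by hand in a few lines of base R. The following is a minimal sketch, not the package's implementation: `violates_dominance()` is a hypothetical helper, the input vectors are made-up contributions, and the comparison uses the strict inequality from the formula with the defaults $n = 2$ and $k = 85$ mentioned in the description.

```r
# Hypothetical helper (not part of sdcLog): applies the (n, k) dominance rule
# to a numeric vector of contributions.
violates_dominance <- function(x, n = 2, k = 85) {
  x <- sort(x[!is.na(x)], decreasing = TRUE) # x_1 >= x_2 >= ... >= x_N
  top_n <- sum(head(x, n))                   # sum of the n largest contributions
  top_n > k / 100 * sum(x)                   # TRUE means the rule is violated
}

# Made-up data: the two largest contributions sum to 90 out of 100.
concentrated <- c(60, 30, 5, 3, 2)
res_concentrated <- violates_dominance(concentrated)

# Made-up data: the two largest contributions sum to 50 out of 100.
balanced <- c(25, 25, 20, 15, 15)
res_balanced <- violates_dominance(balanced)
```

In the first vector the two largest contributions account for 90% of the total, which exceeds 85%, so the rule is violated; in the second they account for 50%, so it is not.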

A list of class `sdc_descriptives` with detailed information about options, settings, and compliance with the criteria distinct entities and dominance.

    sdc_descriptives(
      data = sdc_descriptives_DT,
      id_var = "id",
      val_var = "val_1"
    )

    sdc_descriptives(
      data = sdc_descriptives_DT,
      id_var = "id",
      val_var = "val_1",
      by = "sector"
    )

    sdc_descriptives(
      data = sdc_descriptives_DT,
      id_var = "id",
      val_var = "val_1",
      by = c("sector", "year")
    )

    sdc_descriptives(
      data = sdc_descriptives_DT,
      id_var = "id",
      val_var = "val_2",
      by = c("sector", "year")
    )

    sdc_descriptives(
      data = sdc_descriptives_DT,
      id_var = "id",
      val_var = "val_2",
      by = c("sector", "year"),
      zero_as_NA = FALSE
    )
