An important task and skill in data cleaning is to get to "know" your data. This process takes considerable time and often the resulting knowledge is fragmented. Fragmented in the individual observations that are different from implicit expectations, leading to anecdotical knowledge, but also fragmented across members of a data science team.
An alternative approach is to develop an explicit set of data validation rules. Rules that state when data is valid for your analysis. This approach is part of the validate
package suite.
Many of the validate
related packages do not consider the process of creating the rule sets, they simply assume the rules are available and can be used to check, correct or impute your data.
deriverules
is about creating data validation rule set for new data sets: it aims to help to bootstrap the rule finding process.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.