knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library("cleaninginspectoR")
library("knitr")
Here we create some fake data for illustration purposes. It is not important to understand this; we keep it in so you can run the example yourself if you like. The dataset contains:
a: random values and outliers
uuid: values should be unique but are not
water.source.other: all NA except for two
GPS.lat: just some numbers, but the column header indicates this is potentially sensitive

testdf <- data.frame(
  a = c(runif(98), 7287, -100),
  b = sample(letters, 100, TRUE),
  uuid = c(1:98, 4, 20),
  water.source.other = c(rep(NA, 98), "neighbour's well", "neighbour's well"),
  GPS.lat = runif(100)
)
The function inspect_all runs all cleaning checks that are available.
inspect_all(testdf, uuid.column.name = "uuid")
kable(inspect_all(testdf, uuid.column.name = "uuid"))
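For intuition about the kinds of problems planted in the fake data, the following base R sketch looks for them by hand. It is not how cleaninginspectoR implements its checks: the seed, the median/MAD rule with a cutoff of 3, the "at most 2 filled values" threshold, and the name pattern "gps|lat|lon" are all assumptions made for this illustration.

```r
# Illustrative base R only; NOT the package's implementation.
set.seed(1)  # assumption: seed added so the sketch is reproducible
testdf <- data.frame(
  a = c(runif(98), 7287, -100),
  b = sample(letters, 100, TRUE),
  uuid = c(1:98, 4, 20),
  water.source.other = c(rep(NA, 98), "neighbour's well", "neighbour's well"),
  GPS.lat = runif(100)
)

# Outliers in `a`: distance from the median in units of the MAD.
# A robust rule is used because the extreme value 7287 inflates the
# standard deviation so much that -100 would pass a mean/sd rule.
robust_z <- abs(testdf$a - median(testdf$a)) / mad(testdf$a)
outlier_rows <- which(robust_z > 3)  # cutoff of 3 is an assumption

# Columns that are almost entirely NA (here: at most 2 filled values).
filled_per_column <- colSums(!is.na(testdf))
sparse_columns <- names(filled_per_column)[filled_per_column <= 2]

# Column names that hint at sensitive content (pattern is an assumption).
sensitive_columns <- grep("gps|lat|lon", names(testdf),
                          ignore.case = TRUE, value = TRUE)

print(outlier_rows)
print(sparse_columns)
print(sensitive_columns)
```

With this data the sketch points at the two planted values in a, at water.source.other, and at GPS.lat. The duplicated uuid values are a separate kind of check.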
One of the things inspect_all does is look for duplicates in the first column whose name contains "uuid". If your ID column has a different name, pass it as the second parameter, uuid.column.name:
inspect_all(df = testdf, uuid.column.name = "b")
kable(inspect_all(df = testdf, uuid.column.name = "b"))
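To make the idea of "duplicates in a chosen ID column" concrete, here is a minimal base R sketch. The helper find_duplicate_ids and its output columns are hypothetical names invented for this illustration; they are not part of cleaninginspectoR and may not match what inspect_all returns.

```r
# Hypothetical helper for illustration; not part of cleaninginspectoR.
find_duplicate_ids <- function(df, id_column) {
  ids <- df[[id_column]]
  # Mark every occurrence of a repeated value, not only the later ones
  is_dup <- duplicated(ids) | duplicated(ids, fromLast = TRUE)
  data.frame(index = which(is_dup), value = ids[is_dup])
}

# Small deterministic stand-in for the vignette's data:
# uuid repeats the values 4 and 20; b cycles through four letters.
ids_df <- data.frame(
  uuid = c(1:98, 4, 20),
  b = rep(letters[1:4], 25)
)

dups_uuid <- find_duplicate_ids(ids_df, "uuid")
dups_b <- find_duplicate_ids(ids_df, "b")

print(dups_uuid)
print(nrow(dups_b))
```

Choosing a column such as b, which holds only a handful of distinct values, flags every row as a duplicate. This is why the ID column name should refer to a column that is meant to be unique.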
For more information and individual check functions, see the detailed example.