This is a simple example of a data profiling run.

Make the "project" data frame.

Which may have more than one row, but typically does not.

library(dProf)
dpProj <- dpMakeProject("Test", "A simple dprof test", "Jim P")
str(dpProj)

Make the "tables" data frame.

TblName <- "NewsCurrentEvents"
TblSource <- system.file("extdata", "NewsCurrentEvents.csv",
                         package = "dProf")
dpTables <- dpMakeTable(dpProj$project_id[1], TblName, TblSource, 
                       "Summarize email campaigns", "Jim P",
                       notes = "Using the example csv in project")
str(dpTables)

Make the "column properties" data frame.

dpProjID <- dpProj$project_id[1]
dpTblID <- 1
Tbl <- readr::read_csv(dpTables$table_source[dpTblID])
dpColProp <- dpColumnPropertiesR(dpProjID, dpTblID, Tbl)
str(dpColProp)

Make the "column plots" data frame.

dpColPlts <- dpColumnPlotsR(dpProjID, dpTblID, Tbl, dpColProp)
str(dpColPlts)

Combine profile data frames in .Rdata file

dpMeta <- "This data profile run data set built by ..."
dProfRun_path <- "C:/proj/dProf/"
dProfRun_name <- "dpRun.RData"
save(dpProj, dpTables, dpColProp, dpColPlts, dpMeta,
     file = paste0(dProfRun_path, dProfRun_name))

Run a data profiling report



ds4ci/dProf documentation built on May 15, 2019, 2:56 p.m.