library(ggplot2)
library(pander)
ggAggHist <- getFromNamespace("ggAggHist", "dataMaid")
ggAggBarplot <- getFromNamespace("ggAggBarplot", "dataMaid")

Data report overview

The dataset examined has the following dimensions:


Feature                  Result
-----------------------  ------
Number of observations   100
Number of variables      2

Checks performed

The following variable checks were performed, depending on the data type of each variable:


Check                                                character  factor     labelled   numeric    integer    logical    Date
---------------------------------------------------  ---------  ---------  ---------  ---------  ---------  ---------  ---------
Identify miscoded missing values                     $\times$   $\times$   $\times$   $\times$   $\times$              $\times$
Identify prefixed and suffixed whitespace            $\times$   $\times$   $\times$
Identify levels with < 6 obs.                        $\times$   $\times$   $\times$
Identify case issues                                 $\times$   $\times$   $\times$
Identify misclassified numeric or integer variables  $\times$   $\times$   $\times$
Identify outliers                                                                     $\times$   $\times$              $\times$
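As an illustration of what one of the checks listed above might involve, the following is a minimal sketch of the "levels with < 6 obs." check. The helper `rare_levels()` and its `min_obs` argument are hypothetical names chosen for this example; they are not dataMaid functions, and the package's own implementation may differ.

```r
# Hypothetical helper (not part of dataMaid): return the levels of a
# categorical variable that occur fewer than `min_obs` times.
rare_levels <- function(x, min_obs = 6) {
  counts <- table(x, useNA = "no")   # frequency of each observed level
  n <- as.vector(counts)
  # Unused factor levels (count 0) are skipped; this is a design choice
  # made for the sketch, not documented behaviour.
  names(counts)[n > 0 & n < min_obs]
}

# Toy data: "b" occurs 3 times, so it is the only rare level.
x <- factor(c(rep("a", 10), rep("b", 3), rep("c", 6)))
rare_levels(x)
```

The threshold is exposed as an argument so that the same sketch can be reused with a cut-off other than 6.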

Non-supported variable types were set to be handled in the following way:

Please note that all numerical values in the following have been rounded to 2 decimals.

Summary table


Variable      Variable class  # unique values  Missing observations  Any problems?
------------  --------------  ---------------  --------------------  -------------
[complexVar]  complex         100              0.00 %
[numericVar]  integer         100              0.00 %
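The columns of the summary table can be recomputed with base R. The sketch below assumes a data frame `dat` in which both variables hold the values 1 to 100; the report does not show the underlying data, so this is only a stand-in that is consistent with the figures reported here. The object names `dat` and `overview` are likewise made up for the example.

```r
# Assumed stand-in data: 100 observations, one complex and one integer column.
dat <- data.frame(
  complexVar = complex(real = 1:100, imaginary = 0),
  numericVar = 1:100
)

# One row per variable: class, number of unique values, share of missing values.
overview <- data.frame(
  variable = names(dat),
  class    = unname(vapply(dat, function(x) class(x)[1], character(1))),
  n_unique = unname(vapply(dat, function(x) length(unique(x)), integer(1))),
  missing  = unname(vapply(dat, function(x) sprintf("%.2f %%", 100 * mean(is.na(x))),
                           character(1))),
  stringsAsFactors = FALSE
)

dim(dat)   # number of observations and number of variables
overview
```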

Variable list

complexVar

\bminione


Feature                  Result
-----------------------  ------------
Variable type            complex
Number of missing obs.   0 (0 %)
Number of unique values  100
Median                   50.5
1st and 3rd quartiles    25.75; 75.25
Min. and max.            1; 100

\emini \bminitwo

ggAggHist(data = structure(list(factorV = structure(1:20, .Label = c("[1,5.95]", 
"(5.95,10.9]", "(10.9,15.9]", "(15.9,20.8]", "(20.8,25.8]", "(25.8,30.7]", 
"(30.7,35.6]", "(35.6,40.6]", "(40.6,45.6]", "(45.6,50.5]", "(50.5,55.5]", 
"(55.5,60.4]", "(60.4,65.4]", "(65.4,70.3]", "(70.3,75.2]", "(75.2,80.2]", 
"(80.2,85.2]", "(85.2,90.1]", "(90.1,95]", "(95,100]"), class = "factor"), 
    Freq = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), xmin = c(1, 5.95, 10.9, 
    15.85, 20.8, 25.75, 30.7, 35.65, 40.6, 45.55, 50.5, 55.45, 
    60.4, 65.35, 70.3, 75.25, 80.2, 85.15, 90.1, 95.05), xmax = c(5.95, 
    10.9, 15.85, 20.8, 25.75, 30.7, 35.65, 40.6, 45.55, 50.5, 
    55.45, 60.4, 65.35, 70.3, 75.25, 80.2, 85.15, 90.1, 95.05, 
    100), ymin = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0), ymax = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L)), .Names = c("factorV", 
"Freq", "xmin", "xmax", "ymin", "ymax"), row.names = c(NA, -20L
), class = "data.frame"), vnam = "complexVar")

\emini

\fullline
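The summary statistics reported above are what base R returns for the integers 1 to 100. The sketch below assumes that this is the underlying data, which the report itself does not state, and it makes no claim about how the report treats the imaginary part of a complex variable.

```r
# Assumed data: the integers 1 to 100 (consistent with the report, not confirmed by it).
x <- 1:100

med   <- median(x)                            # middle of the sorted values
quart <- unname(quantile(x, c(0.25, 0.75)))   # default quantile definition (type 7)
rng   <- range(x)                             # minimum and maximum

# Round to 2 decimals, as the report does for all numerical values.
round(c(median = med, q1 = quart[1], q3 = quart[2], min = rng[1], max = rng[2]), 2)
```

With `quantile()`'s default definition, the first quartile of 100 equally spaced values falls at position 1 + 0.25 * 99 = 25.75, which is why the quartiles are not whole numbers.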

numericVar

\bminione


Feature                  Result
-----------------------  ------------
Variable type            integer
Number of missing obs.   0 (0 %)
Number of unique values  100
Median                   50.5
1st and 3rd quartiles    25.75; 75.25
Min. and max.            1; 100

\emini \bminitwo

ggAggHist(data = structure(list(factorV = structure(1:20, .Label = c("[1,5.95]", 
"(5.95,10.9]", "(10.9,15.9]", "(15.9,20.8]", "(20.8,25.8]", "(25.8,30.7]", 
"(30.7,35.6]", "(35.6,40.6]", "(40.6,45.6]", "(45.6,50.5]", "(50.5,55.5]", 
"(55.5,60.4]", "(60.4,65.4]", "(65.4,70.3]", "(70.3,75.2]", "(75.2,80.2]", 
"(80.2,85.2]", "(85.2,90.1]", "(90.1,95]", "(95,100]"), class = "factor"), 
    Freq = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), xmin = c(1, 5.95, 10.9, 
    15.85, 20.8, 25.75, 30.7, 35.65, 40.6, 45.55, 50.5, 55.45, 
    60.4, 65.35, 70.3, 75.25, 80.2, 85.15, 90.1, 95.05), xmax = c(5.95, 
    10.9, 15.85, 20.8, 25.75, 30.7, 35.65, 40.6, 45.55, 50.5, 
    55.45, 60.4, 65.35, 70.3, 75.25, 80.2, 85.15, 90.1, 95.05, 
    100), ymin = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 0, 0, 0), ymax = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
    5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L)), .Names = c("factorV", 
"Freq", "xmin", "xmax", "ymin", "ymax"), row.names = c(NA, -20L
), class = "data.frame"), vnam = "numericVar")

\emini

\fullline
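The aggregated data passed to `ggAggHist()` above consists of 20 equal-width bins with 5 observations each. The sketch below shows one way such a table could be built with base R, assuming the variable holds the integers 1 to 100. It does not call `ggAggHist()` and is not taken from the package source; the names `breaks`, `bins`, and `agg` are chosen for this example.

```r
# Assumed data: the integers 1 to 100.
x <- 1:100

# 21 break points define 20 bins of width 4.95 (1, 5.95, 10.9, ..., 95.05, 100).
# The last break is set to exactly 100 to avoid floating-point drift.
breaks <- c(1 + 4.95 * 0:19, 100)

# Assign each value to a bin; the first bin is closed on the left, "[1,5.95]".
bins <- cut(x, breaks = breaks, include.lowest = TRUE)

# One row per bin, with the same columns as the structure shown above.
agg <- data.frame(
  factorV = levels(bins),
  Freq    = as.integer(table(bins)),
  xmin    = head(breaks, -1),
  xmax    = tail(breaks, -1),
  ymin    = 0
)
agg$ymax <- agg$Freq

head(agg, 3)
```

Storing pre-aggregated counts instead of the raw values means the plot can be redrawn from the report source without access to the original data.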

Report generation information:



ekstroem/cleanR documentation built on Jan. 31, 2022, 8:58 a.m.