
Takes a data set and returns a summary of its continuous variables


`data`: a data.frame or data.table

Knowing the distribution of the continuous variables in a data set is essential. `dqcontinuous` produces an output that reports, for each continuous variable: the variable name, non-missing values, missing values, percentage missing, minimum, average, maximum, standard deviation, variance, common percentiles from 1 to 99, and number of outliers.
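As a rough illustration of the measures listed above, the following base R sketch computes them by hand for a single vector. It is not the package's implementation: the list `stats`, its element names, the chosen percentiles, and the 0 to 100 scale used for the missing percentage are all assumptions made for this example.

```r
# Example vector with one missing value (same values as column y in the Examples)
y <- c(22, NA, 66, 12, 78, 34, 590, 97, 56, 37)

# Hand-computed measures; names and scales are illustrative assumptions,
# not the column names returned by dqcontinuous
stats <- list(
  non_missing = sum(!is.na(y)),
  missing     = sum(is.na(y)),
  pct_missing = 100 * mean(is.na(y)),
  minimum     = min(y, na.rm = TRUE),
  average     = mean(y, na.rm = TRUE),
  maximum     = max(y, na.rm = TRUE),
  std_dev     = sd(y, na.rm = TRUE),
  variance    = var(y, na.rm = TRUE),
  percentiles = quantile(y, probs = c(0.01, 0.25, 0.50, 0.75, 0.99),
                         na.rm = TRUE)
)
```

Comparing these values with the corresponding row of the `dqcontinuous` output is a quick way to check how the report should be read.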

The function treats all integer and numeric variables as continuous and produces output for them. If some variables are stored as integer or numeric but do not represent a continuous quantity, change their type to an appropriate class before calling the function.
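A minimal sketch of that type change, using a made-up data set: the data frame `customers` and its columns `id`, `zip`, and `income` are hypothetical, and the `is.numeric` check only approximates how integer and numeric columns would be picked up.

```r
# Hypothetical data: id and zip are stored as numbers but are really labels
customers <- data.frame(
  id     = c(101L, 102L, 103L),
  zip    = c(10001, 94105, 60601),
  income = c(52000, 61000.5, 48000)
)

# Convert the label-like columns so they are no longer integer or numeric
customers$id  <- as.character(customers$id)
customers$zip <- as.factor(customers$zip)

# Columns that are still integer or numeric after the conversion
continuous_cols <- names(customers)[sapply(customers, is.numeric)]
```

After the conversion only `income` remains numeric, so it is the only column that would be summarised as continuous.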

`dqcontinuous` uses the same criterion to identify outliers as the one used for box plots. All values greater than the 75th percentile plus 1.5 times the interquartile range, or less than the 25th percentile minus 1.5 times the interquartile range, are tagged as outliers.
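The rule described above can be sketched in base R as follows. The helper `count_outliers` is hypothetical and not part of StatMeasures; it assumes the default `quantile` method, which may differ from the quantile method the package uses internally.

```r
# Hypothetical helper: counts values outside the box plot fences
count_outliers <- function(x) {
  # 25th and 75th percentiles, ignoring missing values
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]

  # Fences at 1.5 times the interquartile range beyond the quartiles
  lower <- q[1] - 1.5 * iqr
  upper <- q[2] + 1.5 * iqr

  # Missing values are never counted as outliers
  sum(x < lower | x > upper, na.rm = TRUE)
}

y <- c(22, NA, 66, 12, 78, 34, 590, 97, 56, 37)
n_outliers <- count_outliers(y)
```

For this vector the quartiles are 34 and 78, so the fences are -32 and 144, and only the value 590 falls outside them.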

This function works for both 'data.frame' and 'data.table' inputs but always returns a 'data.frame'.

A data.frame which contains, for every integer and numeric variable, the non-missing values, missing values, percentage of missing values, minimum, mean, maximum, standard deviation, variance, percentiles and count of outliers.

Akash Jain

`dqcategorical`, `dqdate`, `contents`

```
# A 'data.frame'
df <- data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                 y = c(22, NA, 66, 12, 78, 34, 590, 97, 56, 37))

# Generate a data quality report of continuous variables
summaryContinuous <- dqcontinuous(data = df)
```

StatMeasures documentation built on May 30, 2017, 4:47 a.m.
