surveyOutliers: Report the outlier values for all numerical fields

Description

This function provides a report showing the outlier values for each numerical field. It tries to determine the type of distribution automatically (Normal or Log-Normal) by comparing the difference between the mean and the median of the normalized untransformed data with the same difference for the normalized log-transformed data.
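The distribution check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the package's actual code: `guess_distribution` is a hypothetical helper, "normalized" is assumed to mean centering and scaling by the standard deviation, and the distribution with the smaller mean-median gap is assumed to win.

```r
# Hypothetical helper, NOT part of HighFrequencyChecks.
# Assumption: "normalized" means (x - mean) / sd, and the distribution
# whose normalized mean and median are closest is the one selected.
guess_distribution <- function(x) {
  x <- x[!is.na(x)]
  # Too few distinct values to say anything (sd would be zero or NA)
  if (length(unique(x)) < 2) return(NA_character_)

  # Absolute gap between mean and median after normalization
  gap <- function(v) {
    z <- (v - mean(v)) / sd(v)
    abs(mean(z) - median(z))
  }

  # The log transform is only defined for strictly positive values
  if (any(x <= 0)) return("Normal")

  if (gap(log(x)) < gap(x)) "Log-Normal" else "Normal"
}

guess_distribution(c(1, 2, 3, 4, 5))       # symmetric values
guess_distribution(exp(c(-2, -1, 0, 1, 2))) # symmetric on the log scale
```

A symmetric sample has a mean-median gap close to zero, so a right-skewed field that becomes symmetric after `log()` would be treated as Log-Normal under these assumptions.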

Usage

surveyOutliers(
  ds = NULL,
  enumeratorID = NULL,
  sdval = 2,
  reportingColumns = c(enumeratorID, uniqueID),
  enumeratorCheck = FALSE
)

Arguments

ds

dataset containing the survey (from kobo): data.frame

enumeratorID

name of the field where the enumerator ID is stored: string

sdval

(Optional, by default set to 2) number of standard deviations from the mean within which a value is considered acceptable; values further away are reported as outliers: integer

reportingColumns

(Optional, by default built from the enumeratorID and the uniqueID) names of the columns from the dataset you want in the result: character vector (c('col1','col2',...))

enumeratorCheck

(Optional, by default set to FALSE) specifies whether the report is broken down by enumerator: boolean (TRUE/FALSE)

uniqueID

name of the field where the survey unique ID is stored: string
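To make the role of `sdval` concrete, here is a minimal sketch of a standard-deviation outlier rule. It is an assumption about the general technique, not the package's implementation: `flag_outliers` is a hypothetical helper, and the real function may apply the rule on a log scale for fields it classifies as Log-Normal.

```r
# Hypothetical helper, NOT part of HighFrequencyChecks.
# Flags values lying more than `sdval` standard deviations from the mean.
flag_outliers <- function(x, sdval = 2) {
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  # Missing values are never flagged
  !is.na(x) & abs(x - m) > sdval * s
}

values <- c(10, 11, 9, 10, 12, 10, 11, 9, 10, 50)
values[flag_outliers(values, sdval = 2)]  # values beyond 2 standard deviations
values[flag_outliers(values, sdval = 3)]  # a wider band flags fewer values
```

Raising `sdval` widens the acceptable band, so fewer values are reported; lowering it makes the check stricter.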

Value

dst same dataset as the input one, with surveys marked for deletion if errors are found and delete=TRUE (or NULL)

ret_log list of the errors found (or NULL)

var a list of values (or NULL)

graph graphical representation of the results (or NULL)

Author(s)

Yannick Pascaud

Examples

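{
# The list[...] <- multiple-assignment syntax used below is provided by
# the gsubfn package (assumed here to be installed)
library(gsubfn)

# Sample dataset shipped with the package
ds <- HighFrequencyChecks::sample_dataset
enumeratorID <- "enumerator_id"
uniqueID <- "X_uuid"
reportingColumns <- c(enumeratorID, uniqueID)
sdval <- 2

# Unpack the four returned elements into separate variables
list[dst, ret_log, var, graph] <- surveyOutliers(ds = ds,
                                                 enumeratorID = enumeratorID,
                                                 sdval = sdval,
                                                 reportingColumns = reportingColumns,
                                                 enumeratorCheck = FALSE)
# Show the first 10 reported outliers
head(ret_log, 10)
}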

PYannick/HighFrequencyChecks documentation built on Dec. 31, 2020, 3:26 p.m.