View source: R/remove_outliers.R
remove_outliers | R Documentation |
This function takes a selected metric and uses z-scores (the number of standard deviations from the mean) to identify and remove outlier weeks for each individual over time. One application is removing weeks with abnormally low collaboration activity, e.g. holidays. Person-weeks with z > -2 for the metric are retained.
Function is based on identify_outlier(), but implements a more elaborate approach as the outliers are identified and removed with respect to each individual, as opposed to the group. Note that remove_outliers() has a longer runtime compared to identify_outlier().
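The difference between per-individual and group-level outlier detection can be sketched in base R. This is a minimal illustration on made-up data, not the package's implementation: the `toy` data frame, its `PersonId` and `Date` columns, and the helper names `z_by_person` and `z_group` are all assumptions introduced for the example.

```r
# Toy person-week data (hypothetical): two people with very different baselines.
toy <- data.frame(
  PersonId = rep(c("A", "B"), each = 8),
  Date = rep(seq(as.Date("2024-01-01"), by = "week", length.out = 8), times = 2),
  Collaboration_hours = c(
    20, 21, 19, 22, 20, 21, 2, 20,  # A: week 7 is a holiday-like dip
    5, 6, 5, 4, 6, 5, 5, 6          # B: consistently low, no dip
  )
)

# z-score of each week relative to that person's own mean and standard deviation.
z_by_person <- ave(
  toy$Collaboration_hours,
  toy$PersonId,
  FUN = function(x) (x - mean(x)) / sd(x)
)

# Keep only person-weeks with z > -2 (the rule described on this page).
cleaned_toy <- toy[z_by_person > -2, ]

# For contrast: one z-score computed across the whole group.
z_group <- (toy$Collaboration_hours - mean(toy$Collaboration_hours)) /
  sd(toy$Collaboration_hours)
```

In this toy data only person A's week 7 is dropped by the per-person rule, while the group-wide z-score stays above -2 for every row, because the gap between the two people's baselines inflates the group standard deviation.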
remove_outliers(data, metric = "Collaboration_hours")
data | A Standard Person Query dataset in the form of a data frame.
metric | Character string containing the name of the metric, e.g. "Collaboration_hours".
For mature functions to remove common outliers, please see the following:
identify_holidayweeks()
identify_nkw()
identify_inactiveweeks()
Returns a new data frame, "cleaned_data", with all metrics, after removing the person-weeks in which the selected metric falls more than 2 standard deviations below that individual's own mean.
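A possible call, following the usage signature above. This sketch assumes the wpa package is installed and uses `sq_data`, the package's bundled sample Standard Person Query dataset, as input; that dataset is not named on this page, so treat it as an assumption and substitute your own data frame as needed.

```r
# Only run the example if the wpa package is available.
has_wpa <- requireNamespace("wpa", quietly = TRUE)

if (has_wpa) {
  # sq_data is assumed here as a stand-in for a Standard Person Query dataset.
  cleaned_data <- wpa::remove_outliers(
    data = wpa::sq_data,
    metric = "Collaboration_hours"
  )

  # Number of person-weeks dropped as low outliers.
  nrow(wpa::sq_data) - nrow(cleaned_data)
}
```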
Mark Powers mark.powers@microsoft.com
Other Data Validation:
check_query(), extract_hr(), flag_ch_ratio(), flag_em_ratio(), flag_extreme(), flag_outlooktime(), hr_trend(), hrvar_count(), hrvar_count_all(), hrvar_trend(), identify_churn(), identify_holidayweeks(), identify_inactiveweeks(), identify_nkw(), identify_outlier(), identify_privacythreshold(), identify_query(), identify_shifts(), identify_shifts_wp(), identify_tenure(), standardise_pq(), subject_validate(), subject_validate_report(), track_HR_change(), validation_report()