Man pages for ELToulemonde/dataPreparation
Automated Data Preparation

adult                        Adult for UCI repository
aggregate_by_key             Automatic data_set aggregation by key
as.POSIXct_fast              Faster date transformation
build_bins                   Compute bins
build_date_factor            Date Factor
build_encoding               Compute encoding
build_scales                 Compute scales
build_target_encoding        Build target encoding
compute_probability_ratio    Compute probability ratio
compute_weight_of_evidence   Compute weight of evidence
data_preparation_news        Show the NEWS file
date_format_unifier          Unify dates format
description                  Describe data set
fast_discretization          Discretization
fast_filter_variables        Filtering useless variables
fast_handle_na               Handle NA values
fast_is_equal                Fast checks of equality
fast_round                   Fast round
fast_scale                   scale
find_and_transform_dates     Identify date columns
find_and_transform_numerics  Identify numeric columns in a data set
generate_date_diffs          Date difference
generate_factor_from_date    Generate factor from dates
generate_from_character      Recode character
generate_from_factor         Recode factor
get_most_frequent_element    Get most frequent element
identify_dates               Identify date columns
messy_adult                  Adult with some ugly columns added
one_hot_encoder              One hot encoder
prepare_set                  Preparation pipeline
remove_percentile_outlier    Percentile outlier filtering
remove_rare_categorical      Filter rare categories
remove_sd_outlier            Standard deviation outlier filtering
same_shape                   Give same shape
set_as_numeric_matrix        Numeric matrix preparation for Machine Learning.
set_col_as_character         Set columns as character
set_col_as_date              Set columns as POSIXct
set_col_as_factor            Set columns as factor
set_col_as_numeric           Set columns as numeric
shape_set                    Final preparation before ML algorithm
target_encode                Target encode
tiny_messy_adult             First 500 rows of 'messy_adult'
un_factor                    Unfactor factor with too many values
which_are_bijection          Identify bijections
which_are_constant           Identify constant columns
which_are_included           Identify columns that are included in others
which_are_in_double          Identify double columns

ELToulemonde/dataPreparation documentation built on July 19, 2023, 11:45 a.m.