NiceNumericCuts: Create factor variables from numeric variables using one of a...

View source: R/nicenumericcuts.R

NiceNumericCuts    R Documentation

Create factor variables from numeric variables using one of a variety of methods.

Description

Create factor variables from numeric variables using one of a variety of methods.

Usage

NiceNumericCuts(
  input.data,
  method = c("tidy.intervals", "percentiles", "equal.width", "custom"),
  num.categories = 2,
  right = TRUE,
  round.input.data = FALSE,
  decimals = 1,
  label.decimals = 1,
  open.ends = TRUE,
  label.style = "tidy.labels",
  number.prefix = "",
  number.suffix = "",
  open.bottom.string = "Less than ",
  closed.bottom.string = " and below",
  open.top.string = "More than ",
  closed.top.string = " and over",
  equal.intervals.start = "",
  equal.intervals.end = "",
  equal.intervals.increment = "",
  custom.breaks = "",
  custom.always.includes.endpoints = FALSE,
  percents = "",
  quantile.type = 7,
  factors.use.labels = TRUE,
  grouping.mark = ",",
  decimals.mark = "."
)

Arguments

input.data

The data.frame or vector containing the data to be categorized. Data should be numeric.

method

A string describing which method is to be used to categorize the data. Options are tidy.intervals, percentiles, equal.width, and custom.

num.categories

An integer specifying the number of categories for the new factor. This does not apply to the percentiles or custom methods, because in those cases the number of categories is implied by the percents and custom.breaks arguments, respectively.

right

A boolean value specifying whether, when determining breaks in the data, the closed side of the interval should be on the right (i.e. on the larger end of the interval).
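The effect of a right-closed versus left-closed interval can be sketched with base R's cut(), which has an argument of the same name. This is only an illustration of the concept; it is an assumption, not something this page states, that NiceNumericCuts treats boundary values the same way.

```r
# Illustration with base R only; not the NiceNumericCuts implementation.
x <- c(0, 5, 10, 15, 20)
breaks <- c(0, 10, 20)

# right = TRUE: intervals are closed on the right, so 10 falls in the first bin.
right.closed <- cut(x, breaks = breaks, right = TRUE, include.lowest = TRUE)

# right = FALSE: intervals are closed on the left, so 10 falls in the second bin.
left.closed <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)

data.frame(x, right.closed, left.closed)
```

Only the value sitting exactly on the interior break (10) changes category between the two settings.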

round.input.data

A boolean value specifying whether the input data should be rounded before categorization.

decimals

An integer which determines how many decimal places the input data should be rounded to when round.input.data is TRUE.
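A small sketch of why rounding before categorization can matter, using base R's round() and cut(). It assumes that decimals is the number of decimal places used when round.input.data is TRUE; the data and break points are made up for illustration.

```r
# Illustration with base R only; the values and breaks are hypothetical.
x <- c(17.96, 18.04, 39.5)
breaks <- c(0, 18, 40)

# Without rounding, 17.96 falls in [0, 18).
raw.bins <- cut(x, breaks = breaks, right = FALSE)

# Rounding to one decimal place first turns 17.96 into 18.0,
# which moves it into [18, 40).
rounded.bins <- cut(round(x, digits = 1), breaks = breaks, right = FALSE)

data.frame(x, raw.bins, rounded.bins)
```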

label.decimals

An integer value which determines how many decimal places should be included in formatted labels.

open.ends

A boolean value which determines whether labels at the upper and lower ends of the range should be open-ended or should contain both end points of the interval. For example, if TRUE, the interval (0, 17] would be labeled "Less than 18"; otherwise it would be labeled "0 to 17".

label.style

A character indicating the style of labels to use for the new factor levels. Options are tidy.labels, inequality.notation, interval.notation, and percentiles, with the latter option only being applicable if method is also percentiles.

number.prefix

A character to be prepended to numbers in new factor labels.

number.suffix

A character to be appended after numbers in new factor labels.

open.bottom.string

A character indicating text to be placed at the beginning of open-ended labels at the start of the range for open intervals.

closed.bottom.string

A character indicating text to be placed at the end of open-ended labels at the start of the range for closed intervals.

open.top.string

A character indicating text to be placed at the beginning of open-ended labels at the end of the range for open intervals.

closed.top.string

A character indicating text to be placed at the end of open-ended labels at the end of the range for closed intervals.

equal.intervals.start

A numeric value indicating the start of the range when using method of equal.width.

equal.intervals.end

A numeric value indicating the end of the range when using method of equal.width.

equal.intervals.increment

A numeric value specifying the width of the increments when using method of equal.width. This value is optional, and overrides num.categories as the number of categories is determined using the start, end, and increment.
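The relationship between start, end, increment, and the resulting number of categories can be sketched in base R. The variable names below are placeholders for the three equal.intervals.* arguments, and the use of seq() and cut() is an assumption made for illustration rather than a description of the package internals.

```r
# Illustration with base R only; not the NiceNumericCuts implementation.
start <- 0
end <- 100
increment <- 25

# The break points follow from start, end, and increment ...
breaks <- seq(from = start, to = end, by = increment)

# ... so the number of categories is implied rather than chosen separately.
n.categories <- length(breaks) - 1

x <- c(3, 25, 26, 99)
bins <- cut(x, breaks = breaks, right = TRUE, include.lowest = TRUE)
table(bins)
```

With these values the increment implies four categories, which is why a separately supplied num.categories would be redundant.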

custom.breaks

A character containing a comma-separated list of numeric values to be used as custom break points when the method is custom.

custom.always.includes.endpoints

A logical value indicating whether, when method is custom, the endpoints are always included. If FALSE, the endpoints for the cuts are determined by the custom.breaks only.
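A rough sketch of how a comma-separated break string might be parsed and applied, using base R only. The second half shows one possible reading of "endpoints are always included" (extending the breaks to cover the data range); that interpretation is an assumption, as are the example string and data.

```r
# Illustration with base R only; not the NiceNumericCuts implementation.
custom.breaks <- "0, 18, 35, 65"
breaks <- as.numeric(trimws(strsplit(custom.breaks, ",", fixed = TRUE)[[1]]))

x <- c(10, 18, 40, 70)

# Breaks taken as given: in base R, values outside the breaks become NA.
breaks.only <- cut(x, breaks = breaks, right = TRUE)

# Assumed interpretation: widen the breaks so the data range is covered.
extended <- sort(unique(c(min(x, breaks), breaks, max(x, breaks))))
with.endpoints <- cut(x, breaks = extended, right = TRUE, include.lowest = TRUE)

data.frame(x, breaks.only, with.endpoints)
```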

percents

A single numeric value, or a character containing a comma-separated list of numeric values, to be used when the method is percentiles. Values should be between 0 and 100.

quantile.type

An integer between 1 and 9 to be passed to quantile, which determines the algorithm for creating quantiles.
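As a sketch of how percent values and a quantile type translate into break points, the following uses stats::quantile() directly. The conversion from percents to probabilities and the way the breaks are assembled are assumptions for illustration; only the call to quantile() with a type argument is taken from the description above.

```r
# Illustration with base R only; not the NiceNumericCuts implementation.
x <- 1:100
percents <- "25, 50, 75"

# Convert the comma-separated percents to probabilities in [0, 1].
probs <- as.numeric(trimws(strsplit(percents, ",", fixed = TRUE)[[1]])) / 100

# type selects one of the nine quantile algorithms; 7 is quantile()'s default.
cutpoints <- unname(quantile(x, probs = probs, type = 7))

# Three percents imply four categories once the data range is added.
breaks <- c(min(x), cutpoints, max(x))
bins <- cut(x, breaks = breaks, include.lowest = TRUE)
table(bins)
```

Different type values can shift the cut points slightly, which in turn can move observations near a boundary into a neighboring category.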

factors.use.labels

A logical value indicating whether numeric information should be extracted from factor labels. If FALSE, the function will instead try to extract the underlying numeric values from Q/Displayr.

grouping.mark

A character to be used as the thousands-grouping mark when inferring numeric information from factor labels.

decimals.mark

A character to be used as the decimal mark when inferring numeric information from factor labels.
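The role of the two marks when reading numbers out of text labels can be sketched as follows. The labels and the two-step substitution are hypothetical and only show the general idea; the package may parse labels differently.

```r
# Illustration with base R only; not the NiceNumericCuts implementation.
labels <- c("1.234,5", "10.000", "0,75")  # hypothetical European-style labels
grouping.mark <- "."
decimals.mark <- ","

# Drop the thousands-grouping mark, then swap the decimal mark for ".".
no.groups <- gsub(grouping.mark, "", labels, fixed = TRUE)
values <- as.numeric(sub(decimals.mark, ".", no.groups, fixed = TRUE))

values
```

Using fixed = TRUE matters here because "." would otherwise be treated as a regular-expression wildcard.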

Value

A data frame containing the new factor variables as columns.


Displayr/flipTransformations documentation built on Feb. 26, 2024, 12:47 a.m.