fcut: Transform data into a 'fsets' S3 class using shapes derived...


fcut R Documentation

Transform data into a fsets S3 class using shapes derived from triangles or raised cosines

Description

This function creates a set of fuzzy attributes from crisp data. Factors, numeric vectors, and matrix or data frame columns are transformed into a set of fuzzy attributes, i.e. columns with membership degrees. Unlike lcut(), the transformation does not use the linguistic approach; instead, it partitions the data using regularly shaped fuzzy sets (such as triangles or raised cosines).

Usage

fcut(x, ...)

## Default S3 method:
fcut(x, ...)

## S3 method for class 'factor'
fcut(x, name = deparse(substitute(x)), ...)

## S3 method for class 'logical'
fcut(x, name = deparse(substitute(x)), ...)

## S3 method for class 'numeric'
fcut(
  x,
  breaks,
  name = deparse(substitute(x)),
  type = c("triangle", "raisedcos"),
  merge = 1,
  parallel = FALSE,
  ...
)

## S3 method for class 'data.frame'
fcut(
  x,
  breaks = NULL,
  name = NULL,
  type = c("triangle", "raisedcos"),
  merge = 1,
  parallel = FALSE,
  ...
)

## S3 method for class 'matrix'
fcut(x, ...)

Arguments

x

Data to be transformed: a vector, matrix, or data frame. Non-numeric data are allowed.

...

Other parameters to some methods.

name

A name from which the names of the created fuzzy attributes are derived. This parameter can be used only if x is a vector. If x is a matrix or data frame, name should be NULL because the fuzzy attribute names are taken from the column names of x.

breaks

This argument determines the break-points that position the fuzzy sets (see also equidist()). It should be an ordered vector of numbers such that the i-th element specifies the beginning, the (i+1)-th the center, and the (i+2)-th the end of the i-th fuzzy set.

Thus the minimum number of break-points is 3; n break-points create n-2 elementary fuzzy sets.

For the i-th fuzzy set (of type='triangle'), x values lower than the i-th break or greater than the (i+2)-th break have a membership degree of zero, values equal to the (i+1)-th break have a membership degree of 1, and values in between have the corresponding membership degree between 0 and 1.

The resulting fuzzy sets are named after the original data by appending a dot (".") and the index i of the fuzzy set.

Unlike base::cut(), x values that are lower or greater than all the given break-points have all membership degrees equal to zero.

For non-numeric data, this argument is ignored. If x is a numeric vector, breaks must be a vector of numeric values. If x is a numeric matrix or data frame, breaks should be a named list containing a numeric vector for each column; if it is not a list, the same values are reused for every column.
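The break-point rule described above can be illustrated with a minimal base R sketch. It does not call fcut(); the helper tri_membership() and the column prefix "age" are hypothetical names introduced here, and the treatment of repeated break-points (a right-angled triangle whose peak sits on the boundary) is an assumption about how such degenerate cases are meant to behave.

```r
# Hypothetical helper (not part of lfl): triangular membership degree of x
# for a fuzzy set that begins at lo, peaks at center and ends at hi.
tri_membership <- function(x, lo, center, hi) {
  # Rising edge; if lo == center the edge is vertical (assumed behaviour).
  rising <- if (center > lo) (x - lo) / (center - lo) else as.numeric(x >= lo)
  # Falling edge; if center == hi the edge is vertical (assumed behaviour).
  falling <- if (hi > center) (hi - x) / (hi - center) else as.numeric(x <= hi)
  # Clamp to [0, 1]; values outside [lo, hi] end up at 0.
  pmax(0, pmin(rising, falling, 1))
}

breaks <- c(0, 0, 0.5, 1, 1)          # 5 break-points
x <- c(-0.2, 0, 0.25, 0.5, 0.75, 1, 1.3)

# n break-points give n - 2 elementary fuzzy sets:
# set i uses breaks[i], breaks[i + 1], breaks[i + 2].
n_sets <- length(breaks) - 2
memb <- sapply(seq_len(n_sets), function(i) {
  tri_membership(x, breaks[i], breaks[i + 1], breaks[i + 2])
})
colnames(memb) <- paste0("age.", seq_len(n_sets))
memb
```

In this sketch, the first and last elements of x lie outside all break-points, so their rows contain only zeros, which mirrors the contrast with base::cut() noted above.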

type

The type of fuzzy sets to create. Currently, 'triangle' or 'raisedcos' may be used. The type argument may also be a function with 3 or 4 arguments:

  • if type is a 4-argument function, it is assumed that it computes membership degrees from the values of the first argument while considering the boundaries given by the next 3 arguments;

  • if type is a 3-argument function, it is assumed that it is a factory function similar to triangular() or raisedcosine(), which, from given three boundaries, creates a function that computes membership degrees.
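The two function forms carry the same information, which the following base R sketch tries to show. The names raised_cos4() and raised_cos_factory() are hypothetical, and the cosine formula is one common way to write a raised-cosine shape; it is an assumption here, not a copy of the package's raisedcosine(). The sketch also assumes lo < center < hi.

```r
# 4-argument form: computes membership degrees of x directly from the
# three boundaries. Assumes lo < center < hi.
raised_cos4 <- function(x, lo, center, hi) {
  # Half-width of the side of the peak on which each x lies.
  width <- ifelse(x < center, center - lo, hi - center)
  degree <- (cos((x - center) * pi / width) + 1) / 2
  # Zero membership outside the support [lo, hi].
  ifelse(x < lo | x > hi, 0, degree)
}

# 3-argument form: a factory that fixes the boundaries and returns a
# function of x alone.
raised_cos_factory <- function(lo, center, hi) {
  function(x) raised_cos4(x, lo, center, hi)
}

x <- c(-1, 0, 0.25, 0.5, 0.75, 1, 2)
direct <- raised_cos4(x, 0, 0.5, 1)
via_factory <- raised_cos_factory(0, 0.5, 1)(x)
identical(direct, via_factory)
```

Going by the description above, either form would presumably be accepted as the type argument, the distinction being made by the number of arguments.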

merge

This argument determines whether to derive additional fuzzy sets by merging the elementary fuzzy sets (whose position is determined with the breaks argument) into super-sets. The argument is ignored for non-numeric data in x.

merge may contain any integer from 1 to length(breaks) - 2. The value 1 means that the elementary fuzzy sets should be present in the output. The value 2 means that every two consecutive elementary fuzzy sets should be combined using the Łukasiewicz t-conorm, the value 3 combines every three consecutive elementary fuzzy sets, etc.

The names of the derived (merged) fuzzy sets are formed from the names of the original elementary fuzzy sets by concatenating them with the "|" (pipe) separator.
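As a rough illustration of merging, the base R sketch below combines consecutive elementary fuzzy sets with the Łukasiewicz t-conorm, min(1, a + b), and builds the pipe-separated names. The matrix elementary, its values, and the variable merge_width are made up for the example; the sketch does not reproduce the package's internal code.

```r
# Made-up membership degrees of three elementary fuzzy sets.
elementary <- cbind(
  age.1 = c(1, 0.5, 0,   0),
  age.2 = c(0, 0.5, 1,   0.5),
  age.3 = c(0, 0,   0,   0.5)
)

merge_width <- 2  # combine every two consecutive elementary sets

# Starting column of each run of merge_width consecutive sets.
starts <- seq_len(ncol(elementary) - merge_width + 1)

# Applying the Lukasiewicz t-conorm successively is the same as
# capping the sum of the degrees at 1.
merged <- sapply(starts, function(s) {
  cols <- s:(s + merge_width - 1)
  pmin(1, rowSums(elementary[, cols, drop = FALSE]))
})

# Names of merged sets: elementary names joined by "|".
colnames(merged) <- sapply(starts, function(s) {
  paste(colnames(elementary)[s:(s + merge_width - 1)], collapse = "|")
})
merged
```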

parallel

Whether the processing should be run in parallel or not. Parallelization is implemented using the foreach::foreach() function. The parallel environment must be set properly in advance, e.g. with the doMC::registerDoMC() function. Currently this argument is applied only if x is a matrix or data frame.

Details

The aim of this function is to transform numeric data into a set of fuzzy attributes. The result is an object of class "fsets", i.e. a numeric matrix whose columns represent fuzzy sets (fuzzy attributes) and whose values are the membership degrees.

The function behaves differently depending on the type of the input x.

If x is a factor or a logical vector (or other non-numeric data), a fuzzy set is created for each distinct value of the input, and the data are transformed into crisp membership degrees, i.e. 0 or 1 only.
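The crisp case can be pictured with a small base R sketch: one 0/1 column per level of a factor. This is only an illustration of the idea; the column names used here are simply the factor levels, and the names that fcut() itself produces may differ.

```r
ff <- factor(c("a", "b", "a", "c"))

# One column per level; membership is 1 where the value matches, else 0.
crisp <- sapply(levels(ff), function(lev) as.numeric(ff == lev))
crisp
```

Each row contains exactly one 1, because every observation belongs to exactly one level.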

If x is a numeric vector, fuzzy sets are created according to the break-points specified in the breaks argument, with the 1st, 2nd and 3rd break-points specifying the first fuzzy set, the 2nd, 3rd and 4th break-points specifying the second fuzzy set, etc. The shape of the fuzzy sets is determined by the type argument, which may be either the string 'triangle' or 'raisedcos', or a function that computes the membership degrees itself (see the triangular() or raisedcosine() functions for details). Additionally, super-sets of these elementary sets may be created by specifying the merge argument. Values of this argument specify how many consecutive fuzzy sets should be combined (using the Łukasiewicz t-conorm) to produce super-sets; see the description of merge above.

If a matrix or data frame is provided to this function instead of a single vector, all columns are processed separately as described above and the results are combined with the cbind.fsets() function.

The function properly sets the vars() and specs() properties of the result.

Value

An object of class "fsets" is returned, which is a numeric matrix with columns representing the fuzzy attributes. Each source column of the x argument corresponds to multiple columns in the resulting matrix. Column names indicate the name of the source as well as the index i of the fuzzy set(s); see the description of the arguments breaks and merge above.

The resulting object also has the vars() and specs() properties set, with the former created from the original column names (if x is a matrix or data frame) or from the name argument (if x is a numeric vector). The specs() incidence matrix is created to reflect the superset relations among the merged fuzzy sets.

Author(s)

Michal Burda

See Also

lcut(), equidist(), farules(), pbld(), vars(), specs(), cbind.fsets()

Examples


# fcut on non-numeric data
ff <- factor(substring("statistics", 1:10, 1:10), levels = letters)
fcut(ff)

# transform a single vector into a single fuzzy set
x <- runif(10)
fcut(x, breaks=c(0, 0.5, 1), name='age')

# transform single vector into a partition of the interval 0-1
# (the boundary triangles are right-angled)
fcut(x, breaks=c(0, 0, 0.5, 1, 1), name='age')

# also create supersets
fcut(x, breaks=c(0, 0, 0.5, 1, 1), name='age', merge=c(1, 2))

# transform all columns of a data frame
# with different breakpoints
data <- CO2[, c('conc', 'uptake')]
fcut(data, breaks=list(conc=c(95, 95, 350, 1000, 1000),
                       uptake=c(7, 7, 28.3, 46, 46)))

# using a custom 3-argument function (a function factory):
f <- function(a, b, c) {
  return(function(x) ifelse(a <= x & x <= b, 1, 0))
}
fcut(x, breaks=c(0, 0.5, 1), name='age', type=f)

# using a custom 4-argument function:
f <- function(x, a, b, c) {
  return(ifelse(a <= x & x <= b, 1, 0))
}
fcut(x, breaks=c(0, 0.5, 1), name='age', type=f)


lfl documentation built on Oct. 30, 2024, 9:27 a.m.