
Subroutine called by `lassie`. Discretizes, subsets and removes missing data from a data.frame.

```
preprocess(x, select, continuous, breaks, default_breaks = 4)
```

| Argument | Description |
|---|---|
| `x` | data.frame or matrix. |
| `select` | optional vector of column numbers or column names specifying a subset of data to be used. By default, uses all columns. |
| `continuous` | optional vector of column numbers or column names specifying continuous variables that should be discretized. By default, assumes that every variable is categorical. |
| `breaks` | numeric vector or list passed on to |
| `default_breaks` | default break points for discretizations. Same syntax as in |
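
The difference between a single number and explicit break points can be illustrated with base R's `cut`. This is only a sketch: it assumes the discretization follows `cut`-style semantics and does not call `preprocess` itself. The break points `c(8, 12, 16, 21)` are chosen here purely for illustration.

```r
# Illustration with base R only; assumes cut()-style discretization
girth <- trees$Girth

# A single number requests that many equal-width intervals
by_count <- cut(girth, breaks = 3)

# A numeric vector gives explicit break points (illustrative values)
by_points <- cut(girth, breaks = c(8, 12, 16, 21))

nlevels(by_count)  # number of bins created from the single number
table(by_points)   # observations falling in each explicit interval
```

Values outside explicit break points become `NA` under `cut`, so break points that do not span the data would create missing values.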

List containing the following values:

raw: raw subsetted data.frame

pp: discretized, subsetted and complete data.frame

select

continuous

breaks

default_breaks
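
How `raw` and `pp` relate can be sketched in base R. The function below, `sketch_preprocess`, is a hypothetical stand-in written for this illustration and not the package's implementation. It assumes `cut`-style discretization and that incomplete rows are dropped only from `pp`.

```r
# Hypothetical stand-in for the three steps; not the package's code
sketch_preprocess <- function(x, select = seq_len(ncol(x)),
                              continuous = NULL, breaks = 4) {
  # Step 1: subset the requested columns
  raw <- as.data.frame(x)[, select, drop = FALSE]
  pp <- raw
  # Step 2: discretize the columns marked as continuous
  for (col in continuous) {
    pp[[col]] <- cut(pp[[col]], breaks = breaks)
  }
  # Step 3: keep only rows without missing values
  pp <- pp[complete.cases(pp), , drop = FALSE]
  list(raw = raw, pp = pp)
}

# 'airquality' has missing values in 'Ozone'
df <- airquality[, c("Ozone", "Wind")]
out <- sketch_preprocess(df, continuous = c("Ozone", "Wind"), breaks = 3)

nrow(out$raw)  # every row of the subset
nrow(out$pp)   # only the complete rows
```

In this sketch `raw` keeps the original values and every row, while `pp` holds factors and only complete rows.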

```
# This is what happens behind the curtains in the 'lassie' function
# Here we compute the association between the 'Girth' and 'Height' variables
# of the 'trees' dataset
# 'select' and 'continuous' take column numbers or names
select <- c('Girth', 'Height') # select subset of trees
continuous <- c(1, 2) # both 'Girth' and 'Height' are continuous
# equal-width discretization with 3 bins
breaks <- 3
# Preprocess data: subset, discretize and remove missing data
pre <- preprocess(trees, select, continuous, breaks)
# Estimates marginal and multivariate probabilities from preprocessed data.frame
prob <- estimate_prob(pre$pp)
# Computes local and global association using Ducher's Z
lam <- local_association(prob, measure = 'z')
```
