rthtable: Parallel Computation of Contingency Tables

Description

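Computes contingency tables in parallel. Similar to R's table(), but faster, able to handle more cells, and requiring the ranges of the variables to be specified in advance.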

Usage

rthtable(m, lb, ub, varnames = NULL, dnn = NULL,
  nthreads = rth.nthreads(), nch = nthreads)

Arguments

m

Data matrix, one row per observation.

lb

Vector of lower bounds on the variables.

ub

Vector of upper bounds on the variables.

varnames

Character vector of names of the variables.

dnn

List of names of the levels of the variables.

nthreads

Optional. The maximum number of threads the routine should use with the OpenMP or TBB backends. Has no effect with a CUDA backend. See nthreads.

nch

Number of chunks for partitioning the data.

Details

The function rthtable() is similar to R's table(). It allows more cells than table(), and is much faster. However, unlike table(), here users must specify the ranges of the variables in advance.
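To illustrate what specifying the ranges in advance means, here is a minimal base-R sketch. It does not call rthtable(); it only mimics the idea with table() and factor(), and the data and the names x, y, lb and ub are made up for illustration. Fixing the levels from the bounds means the table shape is known before the data are scanned, and unobserved values still get a (zero) cell.

```r
# Base-R illustration only; rthtable() itself is not called here.
# Hypothetical data: two integer-valued variables.
x <- c(1, 3, 3, 1, 3)   # takes values in 1..3 (2 is never observed)
y <- c(0, 1, 1, 0, 0)   # takes values in 0..1

# Bounds declared up front, one entry per variable.
lb <- c(1, 0)
ub <- c(3, 1)

# Fix the levels from the bounds so every value in range gets a cell,
# whether or not it occurs in the data.
tbl <- table(factor(x, levels = lb[1]:ub[1]),
             factor(y, levels = lb[2]:ub[2]))
tbl
```

The dimensions of the result depend only on lb and ub, not on which values happen to appear in the data.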

The function arylin2mult() is handy for tables of high dimension. For example, one may be interested in searching for outliers, and thus consider cells of small sizes, say less than 5. We can apply which() to the table, then use arylin2mult() to convert the resulting linear indices to multidimensional ones.
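The conversion from linear to multidimensional indices can be sketched in base R with arrayInd(). This is an illustration of the idea, not the implementation of arylin2mult(); the array counts and the bounds lb are made up, and the final shift by the lower bounds is an assumption about how positions map back to variable values.

```r
# Base-R illustration only; arylin2mult() itself is not called here.
# Hypothetical 3 x 2 x 2 array of cell counts.
counts <- array(c(10, 7, 2, 8, 9, 6, 12, 3, 5, 11, 4, 20),
                dim = c(3, 2, 2))

# Linear (column-major) indices of the small cells.
small <- which(counts < 5)

# One row per small cell, one column per dimension (1-based positions).
idx <- arrayInd(small, dim(counts))

# Assumed lower bounds of the variables; shifting by (lb - 1) turns
# 1-based positions into values on each variable's own scale.
lb <- c(1, 1, 0)
vals <- sweep(idx, 2, lb - 1, "+")
vals
```

Each row of vals identifies one small cell by the values of the variables rather than by its position in the array.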

Value

The function rthtable() returns an object of R class table.

The function arylin2mult() returns a matrix of multidimensional indices, one row for each element of lins.

Examples

## Not run: 
library(MASS)
pm <- Pima.te
# cut diabetic pedigree, age into 3 ranges
pm$ped1 <- cut(pm$ped,3,1:3)
pm$age1 <- cut(pm$age,3,1:3)
# for diabetes, recode Yes/No as 1/0
pm$type1 <- as.integer(pm$type == 'Yes')
# names of the levels
dnn <- list(c("low risk","med risk","high risk"),c("young","middle age",
   "senior"),c("no","yes"))
tbl <- rthtable(pm[,9:11],c(1,1,0),c(3,3,1),dnn=dnn)
tbl  # display the table

# which cells are rare (clear visually here, less so with many vars)
tbli <- as.integer(tbl)  # pure cell counts
arylin2mult(which(tbli < 5),c(1,1,0),c(3,3,1))
# e.g. output shows one small cell is (3,1,0), i.e. high risk/young/no


## End(Not run)

matloff/Rth documentation built on May 21, 2019, 12:55 p.m.