cross: Describe everything

View source: R/remix.r

Description

A quick and easy function for describing datasets.

Usage

cross(formula = cbind(...) ~ ., data = NULL, funs = c(` ` =
  mysummary), ..., margin = 0:2, total = FALSE, digits = 2,
  showNA = c("no", "ifany", "always"), method = c("pearson", "kendall",
  "spearman"), times = NULL, followup = FALSE, test = FALSE,
  test.summarize = test.summarize.auto,
  test.survival = test.survival.logrank,
  test.tabular = test.tabular.auto, show.test = display.test,
  plim = 4, show.method = TRUE, effect = FALSE,
  effect.summarize = effect.diff.mean.auto,
  effect.tabular = effect.or.row.by.col,
  effect.survival = effect.survival.coxph, conf.level = 0.95,
  label = FALSE, regroup = FALSE)

Arguments

formula

a formula (see Note).

data

a data.frame.

funs

Functions used for describing numeric variables. Vector (named or not): c(fun1, fun2, fun3) or c("fun1", "fun2", "fun3").

...

further arguments, all passed to funs (for example, na.rm = TRUE).
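As a rough illustration of how funs and ... fit together, the sketch below applies a vector of functions to one numeric variable in base R and forwards na.rm = TRUE to each of them. The helper apply_funs is hypothetical and not part of biostat2; it only mimics the idea that every extra argument reaches every function in funs.

```r
# Hypothetical helper (not part of biostat2): apply each function in `funs`
# to `x`, forwarding any extra arguments to all of them.
apply_funs <- function(x, funs, ...) {
  sapply(funs, function(f, ...) match.fun(f)(x, ...), ...)
}

x <- c(5.1, 4.9, NA, 4.7)

# Named vector of functions: the names label the results.
my_funs <- c(mean = mean, median = median, max = max)
res <- apply_funs(x, my_funs, na.rm = TRUE)
print(res)

# The same functions given by name, as character strings.
res_chr <- apply_funs(x, c("mean", "median", "max"), na.rm = TRUE)
```

Because the extra arguments go to every function, each function in funs has to accept them (here, all three accept na.rm).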

margin

index, or vector of indices to indicate which proportions should be computed in frequency tables (0: cell, 1: row, 2: col).

total

whether to add margins. Integers (c(1, 2): both row and col margins, 1: row margins, 2: col margins, 0: no margins) or logical (TRUE: row and col margins, FALSE: no margins).
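The meaning of the margin indices can be checked with base R alone. The sketch below uses prop.table() and addmargins() on the esoph dataset as stand-ins; it is an analogy for what margin and total control, not the code cross runs internally.

```r
# Contingency table of two factors from the esoph dataset (ships with R).
tab <- table(esoph$alcgp, esoph$tobgp)

cell_prop <- prop.table(tab)      # margin = 0: proportions of the grand total
row_prop  <- prop.table(tab, 1)   # margin = 1: proportions within each row
col_prop  <- prop.table(tab, 2)   # margin = 2: proportions within each column

# total = TRUE corresponds to adding both row and column totals.
with_totals <- addmargins(tab)
print(round(row_prop, 2))
```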

digits

number of digits

showNA

whether to show NA values: "no", "ifany" or "always", as in table().

method

a character string indicating which correlation coefficient is to be used. One of "pearson", "kendall", or "spearman", can be abbreviated.

times

vector of times (see ?summary.survfit in package survival).
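To see what a vector of times does on the survival side, the sketch below calls summary() on a Kaplan-Meier fit directly with the survival package. The time points 10, 20 and 30 are arbitrary values chosen for illustration.

```r
library(survival)

# Kaplan-Meier fit on the aml dataset shipped with survival.
fit <- survfit(Surv(time, status) ~ 1, data = aml)

# Survival estimates reported only at the requested time points.
s <- summary(fit, times = c(10, 20, 30))
print(data.frame(time = s$time, survival = s$surv))
```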

followup

whether to display follow-up time.

test

whether to perform tests

test.summarize

a function of two arguments (a continuous variable and a grouping variable), used to compare the continuous variable across groups. Returns a list of two components: p.value and method (the test name). See test.summarize.auto, test.summarize.kruskal, test.summarize.oneway.equalvar, or test.summarize.unequalvar for some examples of such functions. Users can provide their own function.
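A user-supplied function only has to respect the contract described above: two arguments in, a list with p.value and method out. The sketch below is one possible such function; my_test_summarize is a hypothetical name, and the choice of tests is an assumption made for illustration.

```r
# Hypothetical user-supplied test: Wilcoxon rank-sum for two groups,
# Kruskal-Wallis otherwise. Arguments: x (continuous), g (grouping).
my_test_summarize <- function(x, g) {
  g <- factor(g)
  if (nlevels(g) == 2) {
    res <- suppressWarnings(wilcox.test(x ~ g))
    method <- "Wilcoxon rank-sum test"
  } else {
    res <- kruskal.test(x, g)
    method <- "Kruskal-Wallis test"
  }
  list(p.value = res$p.value, method = method)
}

out <- my_test_summarize(iris$Sepal.Length, iris$Species)
print(out)
```

With biostat2 loaded, it would presumably be passed as cross(Sepal.Length ~ Species, iris, test = TRUE, test.summarize = my_test_summarize).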

test.survival

a function of one argument (a formula), used to compare survival estimates. Returns the same components as created by test.summarize. See test.survival.logrank. Users can provide their own function.
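For survival outcomes the contract is a single formula argument. The sketch below wraps survival::survdiff() in a function of that shape; my_test_survival is a hypothetical name, and the sketch assumes an unstratified formula whose variables are visible from the formula's environment.

```r
library(survival)

# Hypothetical user-supplied test taking a single formula argument.
# Assumes no strata() term, so the degrees of freedom are (groups - 1).
my_test_survival <- function(formula) {
  fit <- survdiff(formula)
  df <- length(fit$n) - 1
  p <- pchisq(fit$chisq, df, lower.tail = FALSE)
  list(p.value = p, method = "Log-rank test")
}

# with() makes time, status and x visible to the formula.
out <- with(aml, my_test_survival(Surv(time, status) ~ x))
print(out)
```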

test.tabular

a function of two arguments (two categorical variables), used to test association between two factors. Returns the same components as created by test.summarize. See test.tabular.auto and test.tabular.fisher. Users can provide their own function.
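A custom association test follows the same pattern with two categorical variables. The sketch below picks between a chi-squared test and Fisher's exact test using the common "all expected counts at least 5" rule of thumb; both the name my_test_tabular and that rule are assumptions for illustration, not a description of test.tabular.auto.

```r
# Hypothetical user-supplied test: chi-squared test when all expected counts
# are at least 5, Fisher's exact test otherwise.
my_test_tabular <- function(x, y) {
  tab <- table(x, y)
  expected <- suppressWarnings(chisq.test(tab)$expected)
  if (all(expected >= 5)) {
    res <- chisq.test(tab, correct = FALSE)
    method <- "Pearson's chi-squared test"
  } else {
    res <- fisher.test(tab)
    method <- "Fisher's exact test"
  }
  list(p.value = res$p.value, method = method)
}

out <- my_test_tabular(mtcars$am, mtcars$vs)
print(out)
```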

show.test

function used to display the test result. See display.test.

plim

number of digits for the p value

show.method

whether to display the test name (logical)

effect

whether to compute an effect measure

effect.summarize

a function of three arguments (a continuous variable, a grouping variable and conf.level), used to compare the continuous variable across groups. Returns a list of five components: effect (the effect value(s)), ci (the matrix of confidence interval(s)), effect.name (the interpretation(s) of the effect value(s)), effect.type (the description of the measure used) and conf.level (the confidence interval level). See effect.diff.mean.auto, effect.diff.mean.student or diff.mean.boot for some examples of such functions. Users can provide their own function.
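The five-component return value is easier to picture with a concrete function. The sketch below computes a difference in means with a Welch confidence interval for a two-level grouping variable; my_effect_summarize is a hypothetical name, and the exact layout of ci that cross expects (here one row, two columns) is an assumption.

```r
# Hypothetical user-supplied effect measure: difference in means between two
# groups with a Welch confidence interval.
# Assumes g has exactly two levels; the layout of `ci` is a guess.
my_effect_summarize <- function(x, g, conf.level = 0.95) {
  g <- factor(g)
  stopifnot(nlevels(g) == 2)
  tt <- t.test(x ~ g, conf.level = conf.level)
  lev <- levels(g)
  list(
    effect      = unname(tt$estimate[1] - tt$estimate[2]),
    ci          = matrix(as.numeric(tt$conf.int), nrow = 1,
                         dimnames = list(NULL, c("lower", "upper"))),
    effect.name = paste(lev[1], "minus", lev[2]),
    effect.type = "Difference in means (Welch CI)",
    conf.level  = conf.level
  )
}

# Keep two species so that the grouping variable has two levels.
two <- droplevels(subset(iris, Species != "virginica"))
out <- my_effect_summarize(two$Sepal.Length, two$Species)
print(out$effect)
```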

effect.tabular

a function of three arguments (two categorical variables and conf.level), used to measure the association between two factors. Returns a list of five components: effect (the effect value(s)), ci (the matrix of confidence interval(s)), effect.name (the interpretation(s) of the effect value(s)), effect.type (the description of the measure used) and conf.level (the confidence interval level). See effect.or.row.by.col, effect.rr.row.by.col, effect.rd.row.by.col, effect.or.col.by.row, effect.rr.col.by.row, or effect.rd.col.by.row for some examples of such functions. Users can provide their own function.
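For two factors, an odds ratio is the typical effect measure. The sketch below computes it for a 2 x 2 table with a Wald interval on the log scale; my_effect_tabular is a hypothetical name, the restriction to 2 x 2 tables without zero cells is a simplifying assumption, and the layout of ci is again a guess.

```r
# Hypothetical user-supplied effect measure: odds ratio for a 2 x 2 table
# with a Wald confidence interval computed on the log scale.
# Assumes both variables have exactly two levels and no zero cells.
my_effect_tabular <- function(x, y, conf.level = 0.95) {
  tab <- table(x, y)
  stopifnot(all(dim(tab) == 2), all(tab > 0))
  or <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
  se <- sqrt(sum(1 / tab))               # standard error of log(OR)
  z <- qnorm(1 - (1 - conf.level) / 2)
  list(
    effect      = or,
    ci          = matrix(exp(log(or) + c(-1, 1) * z * se), nrow = 1,
                         dimnames = list(NULL, c("lower", "upper"))),
    effect.name = "Odds ratio",
    effect.type = "Odds ratio (Wald CI)",
    conf.level  = conf.level
  )
}

out <- my_effect_tabular(mtcars$am, mtcars$vs)
print(out$effect)
```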

effect.survival

a function of two arguments (a formula and conf.level), used to measure the association between a censored variable and a factor. Returns the same components as created by effect.summarize. See effect.survival.coxph. Users can provide their own function.

conf.level

The desired confidence interval level

label

whether to display labels of variables

regroup

whether to regroup numerics with numerics and factors with factors in cbind (logical)

Value

A data.frame, or a list of data.frames.

Note

The formula has the following format: x_1 + x_2 + ... ~ y_1 + y_2 + ...

There are a couple of special variables: ... represents all other variables not used in the formula, and . represents no variable, so you can write formula = var1 ~ . to describe var1 on its own.

If var1 is numeric, var1 ~ . produces a summary table using funs. If var1 is a factor, var1 ~ . produces a frequency table. If var1 is of class Surv, var1 ~ . produces a table with the estimates of survival at times. If var1 is numeric and var2 is numeric, var1 ~ var2 produces a correlation coefficient. If var1 is numeric and var2 is a factor, var1 ~ var2 produces a summary table (using the functions in funs) for each level of var2. If var1 is a factor and var2 is a factor, var1 ~ var2 produces a contingency table. If var1 is of class Surv and var2 is a factor, var1 ~ var2 produces a table with the estimates of survival for each level of var2.
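The cases above can be lined up against base R computations. The sketch below lists a rough base R analogue for each non-survival formula shape; it is meant as a reading aid and is not how cross is implemented.

```r
# Rough base R analogues of what each formula shape describes
# (not how cross() is implemented).

# numeric ~ .       : summary statistics of one numeric variable
num_alone <- summary(iris$Sepal.Length)

# factor ~ .        : frequency table
fac_alone <- table(iris$Species)

# numeric ~ numeric : correlation coefficient (see the method argument)
num_num <- cor(iris$Sepal.Length, iris$Petal.Length, method = "pearson")

# numeric ~ factor  : summary for each level of the factor
num_fac <- tapply(iris$Sepal.Length, iris$Species, mean)

# factor ~ factor   : contingency table
fac_fac <- table(esoph$alcgp, esoph$tobgp)

print(num_fac)
```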

You can group several variables together with cbind(var1, var2, var3): var1, var2 and var3 will be grouped in the same table. cbind(...) also works (i.e., it groups all variables of the data.frame together). When cbind appears on both sides of the formula, cross will do its best to group everything in the same table, but only when that is possible.

Author(s)

David Hajage, inspired by the design and the code of summary.formula (Hmisc package, FE Harrell) and cast (reshape package, H Wickham).

See Also

cast (reshape) and summary.formula (Hmisc).

Examples

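library(biostat2)
# Describe every variable of iris (default formula: cbind(...) ~ .)
cross(data = iris)
# Numeric columns only, with custom summary functions
cross(cbind(...) ~ ., iris[, sapply(iris, is.numeric)], funs = c(median, mad, min, max))
# Quantiles by species; probs is passed on to quantile() through ...
cross(cbind(Sepal.Length, I(Sepal.Width^2)) ~ Species, iris, funs = quantile, probs = c(1/3, 2/3))
# Numeric against numeric: correlation coefficients
cross(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width, iris)
# Same variables, grouped with cbind() on both sides
cross(cbind(Sepal.Length, Sepal.Width) ~ cbind(Petal.Length, Petal.Width), iris)
# All variables of esoph, without cbind()
cross(... ~ ., esoph)
# Factor against factor: contingency table with an association test
cross(alcgp ~ tobgp, esoph, test = TRUE)
# Survival estimates for each level of x
library(survival)
cross(Surv(time, status) ~ x, data = aml)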

eusebe/biostat2 documentation built on Dec. 27, 2019, 4:22 p.m.