computeSummaryStatistics: Compute summary statistics of interest of a unique variable...
In inTextSummaryTable: Creation of in-Text Summary Table

View source: R/computeSummaryStatistics.R

computeSummaryStatistics

R Documentation

Compute summary statistics of interest for a single variable of interest.

Description

Additionally, this function runs extra checks on the data:

- an error message is triggered (by default, see `checkVarDiffBySubj`) if any subject (identified by `subjectVar`) has different values in a continuous `var`
- an informative message is triggered if multiple but identical records are available for a `subjectVar` and a continuous `var`
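
The two checks above can be sketched in base R as follows. This is only an illustration, not the package's implementation: the helper name `checkVarBySubject`, the column `AVAL`, and the toy data are hypothetical, and both findings are reported with `message()` here, whereas the function itself raises an error by default for differing values.

```r
# Hypothetical helper (not part of inTextSummaryTable): finds subjects whose
# continuous variable takes several distinct values, and subjects with
# multiple but identical records.
checkVarBySubject <- function(data, var, subjectVar = "USUBJID") {
  # consider non-missing records only
  data <- data[!is.na(data[[var]]), , drop = FALSE]
  # number of distinct values of 'var' per subject
  nDistinct <- tapply(
    data[[var]], data[[subjectVar]],
    function(x) length(unique(x))
  )
  # number of records per subject
  nRecords <- tapply(data[[var]], data[[subjectVar]], length)
  list(
    different = names(nDistinct)[which(nDistinct > 1)],
    duplicated = names(nRecords)[which(nRecords > 1 & nDistinct == 1)]
  )
}

# Toy data (made up for this sketch)
toyData <- data.frame(
  USUBJID = c("01", "01", "02", "02", "03"),
  AVAL = c(5.1, 6.3, 4.0, 4.0, 7.2)
)

res <- checkVarBySubject(toyData, var = "AVAL")
if (length(res$different) > 0)
  message("Different values for subject(s): ", toString(res$different))
if (length(res$duplicated) > 0)
  message("Identical records for subject(s): ", toString(res$duplicated))
```

In this toy dataset, subject "01" has two different values and subject "02" has two identical records.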

Usage

computeSummaryStatistics(
  data,
  var = NULL,
  varTotalInclude = FALSE,
  statsExtra = NULL,
  subjectVar = "USUBJID",
  filterEmptyVar = TRUE,
  type = "auto",
  checkVarDiffBySubj = c("error", "warning", "none"),
  msgLabel = NULL,
  msgVars = NULL
)

Arguments

`data`	Data.frame with dataset to consider for the summary table.
`var`	Character vector with variable(s) of `data` to compute statistics on. If NULL (by default), counts by row/column variable(s) are computed. To also return counts of the `rowVar` when other `var` are specified, include 'all' in `var`. Missing values, if present, are filtered (also for the reported number of subjects/records).
`varTotalInclude`	Logical (FALSE by default). Should the total across all categories of `var` be included for the count table? Only used if `var` is a categorical variable.
`statsExtra`	(optional) Named list with functions for additional custom statistics to be computed. Each function takes as its parameter either 'x' (the variable `var` to compute the summary statistic on) or 'data' (the entire dataset), and returns the corresponding summary statistic as a numeric vector. For example, to additionally compute the coefficient of variation, this can be set to: `list(statCVPerc = function(x) sd(x)/mean(x)*100)` (or `cv`).
`subjectVar`	String, variable of `data` with subject ID, 'USUBJID' by default.
`filterEmptyVar`	Logical. If TRUE, no results are returned if the variable is empty; otherwise 0 is returned for the counts and NA for the summary statistics. A variable is considered empty: for a continuous variable, if all values are missing (NA); for a categorical variable, if all values are missing or if a category is included in the factor levels but not available in `data`. By default, empty variables are filtered.
`type`	String with type of table: 'summaryTable' (summary table with statistics for a numeric variable), 'countTable' (count table), or 'auto' (by default: 'summaryTable' if the variable is numeric, 'countTable' otherwise).
`checkVarDiffBySubj`	String, 'error' (default), 'warning', or 'none'. Should an error, a warning, or nothing be produced if a continuous variable (`var`) contains different values for the same subject?
`msgLabel`	(optional) String with label for the data (NULL by default), included in the message/warning for checks.
`msgVars`	(optional) Character vector with columns of `data` containing extra variables (besides `var` and `subjectVar`) that should be included in the message/warning for checks.
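
The `statsExtra` description says that each custom function takes either 'x' or 'data' as its parameter. A minimal base R sketch of how such a list could be written and applied is given below. The dispatcher `applyStatsExtra`, the statistic `statNRecords`, and the toy data are hypothetical and only illustrate the convention; how the package itself calls these functions is not shown in this page.

```r
# Custom statistics: each function takes either 'x' (the variable)
# or 'data' (the entire dataset) and returns a numeric value
statsExtra <- list(
  # coefficient of variation in percent, as in the 'statsExtra' example
  statCVPerc = function(x) sd(x) / mean(x) * 100,
  # hypothetical statistic computed on the whole dataset
  statNRecords = function(data) nrow(data)
)

# Hypothetical dispatcher: calls each function with the variable or the
# dataset, depending on the name of its first parameter
applyStatsExtra <- function(statsExtra, data, var) {
  sapply(statsExtra, function(fct) {
    argName <- names(formals(fct))[1]
    switch(argName,
      x = fct(data[[var]]),
      data = fct(data),
      stop("Each function should have 'x' or 'data' as parameter.")
    )
  })
}

# Toy data (made up for this sketch)
toyData <- data.frame(
  USUBJID = c("01", "02", "03"),
  AVAL = c(4, 6, 8)
)

res <- applyStatsExtra(statsExtra, data = toyData, var = "AVAL")
print(res)
```

Checking a custom function on a small vector like this before passing it via `statsExtra` makes it easier to confirm that it returns a numeric value.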

Value

Data.frame with summary statistics in columns, depending on the type of table:

'summaryTable':
- 'statN': number of subjects
- 'statm': number of records
- 'statMean': mean of var
- 'statSD': standard deviation of var
- 'statSE': standard error of the mean of var
- 'statMedian': median of var
- 'statMin': minimum of var
- 'statMax': maximum of var
'countTable':
- 'variableGroup': factor with groups of var for which counts are reported
- 'statN': number of subjects
- 'statm': number of records
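
To make the meaning of these columns concrete, the sketch below computes them by hand in base R on made-up data. It is not the package's code: the datasets and the columns `AVAL` and `SEX` are hypothetical, and the standard error is assumed here to be the standard deviation divided by the square root of the number of records.

```r
# Toy continuous data (made up): one record per subject
contData <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  AVAL = c(4, 6, 8, 6)
)
x <- contData$AVAL

# Columns listed for a summary table
summaryStats <- data.frame(
  statN = length(unique(contData$USUBJID)),  # number of subjects
  statm = nrow(contData),                    # number of records
  statMean = mean(x),
  statSD = sd(x),
  statSE = sd(x) / sqrt(length(x)),          # assumed definition of the SE
  statMedian = median(x),
  statMin = min(x),
  statMax = max(x)
)

# Toy categorical data (made up): subject "01" has two records
catData <- data.frame(
  USUBJID = c("01", "01", "02", "03"),
  SEX = factor(c("F", "F", "M", "F"))
)

# Columns listed for a count table: one row per category
countStats <- data.frame(
  variableGroup = factor(levels(catData$SEX), levels = levels(catData$SEX)),
  # number of distinct subjects per category
  statN = as.vector(tapply(
    catData$USUBJID, catData$SEX,
    function(id) length(unique(id))
  )),
  # number of records per category
  statm = as.vector(table(catData$SEX))
)

print(summaryStats)
print(countStats)
```

The count table shows why 'statN' and 'statm' can differ: category "F" contains three records but only two distinct subjects.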