computeSummaryStatistics: Compute summary statistics of interest of a unique variable...

View source: R/computeSummaryStatistics.R

computeSummaryStatistics    R Documentation

Compute summary statistics of interest for a unique variable of interest.

Description

Additionally, this function runs extra checks on the data:

  • an error message is triggered (by default, see checkVarDiffBySubj) if any subject (identified by subjectVar) has different values for a continuous var

  • an informative message is triggered if multiple but identical records are available for subjectVar and a continuous var
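
The two situations covered by these checks can be illustrated with a small base R sketch. This is not the package's implementation: the dataset dat, the variable AVAL and the object names are hypothetical and only serve to show what is being detected.

```r
# Hypothetical example data: subject "02" has two different values,
# subject "03" has two identical records.
dat <- data.frame(
  USUBJID = c("01", "02", "02", "03", "03"),
  AVAL = c(1.2, 3.4, 5.6, 7.8, 7.8)
)

# Number of distinct values of the continuous variable per subject
nValuesBySubj <- tapply(dat$AVAL, dat$USUBJID, function(x) length(unique(x)))
# Number of records per subject
nRecordsBySubj <- tapply(dat$AVAL, dat$USUBJID, length)

# Case 1: different values for the same subject (an error by default)
subjDiff <- names(nValuesBySubj)[nValuesBySubj > 1]

# Case 2: multiple but identical records (an informative message)
subjDupl <- names(nValuesBySubj)[nValuesBySubj == 1 & nRecordsBySubj > 1]

print(subjDiff)
print(subjDupl)
```

Here subject "02" falls in the first case and subject "03" in the second.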

Usage

computeSummaryStatistics(
  data,
  var = NULL,
  varTotalInclude = FALSE,
  statsExtra = NULL,
  subjectVar = "USUBJID",
  filterEmptyVar = TRUE,
  type = "auto",
  checkVarDiffBySubj = c("error", "warning", "none"),
  msgLabel = NULL,
  msgVars = NULL
)

Arguments

data

Data.frame with dataset to consider for the summary table.

var

Character vector with variable(s) of data, to compute statistics on.
If NULL (by default), counts by row/column variable(s) are computed.
To also return counts of the rowVar when other var are specified, include 'all' in var.
Missing values, if present, are filtered out (including for the reported number of subjects/records).

varTotalInclude

Logical (FALSE by default). Should the total across all categories of var be included in the count table? Only used if var is a categorical variable.

statsExtra

(optional) Named list with functions for additional custom statistics to be computed.
Each function:

  • takes as parameter either 'x', the variable (var) to compute the summary statistic on, or 'data', the entire dataset

  • returns the corresponding summary statistic as a numeric vector

For example, to additionally compute the coefficient of variation, this can be set to: list(statCVPerc = function(x) sd(x)/mean(x)*100) (or cv).
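
As a sketch of what such a list could look like, the example below defines one function of 'x' and one function of 'data', and applies them by hand in base R. The dataset dat, the columns AVAL and AVISIT, the statistic statNVisits and the dispatch on the parameter name are assumptions made for illustration, not the package's internal logic.

```r
# Hypothetical dataset
dat <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  AVISIT = c("Week 1", "Week 2", "Week 1", "Week 2"),
  AVAL = c(10, 12, 14, 16)
)

# Named list of custom statistics:
# - 'statCVPerc' takes 'x', the variable to summarize
# - 'statNVisits' (hypothetical) takes 'data', the entire dataset
statsExtra <- list(
  statCVPerc = function(x) sd(x) / mean(x) * 100,
  statNVisits = function(data) length(unique(data$AVISIT))
)

# Illustrative dispatch: pass the variable or the dataset,
# depending on the name of the function parameter
extraStats <- sapply(statsExtra, function(fct) {
  if ("x" %in% names(formals(fct))) fct(dat$AVAL) else fct(dat)
})
print(extraStats)
```

The names of the list become the names of the resulting statistics.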

subjectVar

String, variable of data with subject ID, 'USUBJID' by default.

filterEmptyVar

Logical. If TRUE, no results are returned if the variable is empty; otherwise 0 is returned for the counts and NA for the summary statistics. The criteria to consider a variable empty are:

  • for a continuous variable: all missing (NA)

  • for a categorical variable: all missing, or a category is included in the factor levels but not available in data

By default, empty variables are filtered.
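
The criteria above can be expressed in a few lines of base R. The helper names isEmptyContinuous and emptyLevels are hypothetical; this is a sketch of the criteria as described, not the package's code.

```r
# Continuous variable: empty if all values are missing
isEmptyContinuous <- function(x) all(is.na(x))

# Categorical variable: categories declared in the factor levels
# but not available in the data
emptyLevels <- function(x) setdiff(levels(x), as.character(x[!is.na(x)]))

isEmptyContinuous(c(NA_real_, NA_real_))  # TRUE

sev <- factor(
  c("MILD", "MILD", "SEVERE"),
  levels = c("MILD", "MODERATE", "SEVERE")
)
emptyLevels(sev)  # "MODERATE"
```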

type

String with type of table:

  • 'summaryTable': summary table with statistics for numeric variable

  • 'countTable': count table

  • 'auto' (by default): 'summaryTable' if the variable is numeric, 'countTable' otherwise
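
The 'auto' rule can be sketched as follows. The helper resolveType is hypothetical and only mirrors the rule stated above; it is not a function of the package.

```r
# Hypothetical helper: resolve 'auto' based on the class of the variable
resolveType <- function(x, type = c("auto", "summaryTable", "countTable")) {
  type <- match.arg(type)
  if (type == "auto") {
    type <- if (is.numeric(x)) "summaryTable" else "countTable"
  }
  type
}

resolveType(c(1.5, 2.3))             # "summaryTable"
resolveType(factor(c("F", "M")))     # "countTable"
resolveType(1:3, type = "countTable")  # "countTable"
```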

checkVarDiffBySubj

String, 'error' (default), 'warning', or 'none'. Should an error, a warning, or nothing be produced if a continuous variable (var) contains different values for the same subject?
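
The three options map naturally onto an error, a warning, or no action. The sketch below shows this pattern with a hypothetical helper reportVarDiff; the use of match.arg (which selects the first choice, 'error', when the argument is not specified) is an assumption based on the default shown in Usage.

```r
# Hypothetical helper: report a problem according to the chosen option
reportVarDiff <- function(msg, checkVarDiffBySubj = c("error", "warning", "none")) {
  # With a vector default, match.arg selects the first element: "error"
  checkVarDiffBySubj <- match.arg(checkVarDiffBySubj)
  switch(checkVarDiffBySubj,
    error = stop(msg),
    warning = warning(msg),
    none = invisible(NULL)
  )
}

msg <- "Different values available for the same subject."
resError <- tryCatch(reportVarDiff(msg), error = function(e) "error")
resWarning <- tryCatch(reportVarDiff(msg, "warning"), warning = function(w) "warning")
resNone <- reportVarDiff(msg, "none")
```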

msgLabel

(optional) String with label for the data (NULL by default), included in the message/warning for checks.

msgVars

(optional) Character vector with columns of data containing extra variables (besides var and subjectVar) that should be included in the message/warning for checks.

Value

Data.frame with summary statistics in columns, depending on whether type is:

  • 'summaryTable':

    • 'statN': number of subjects

    • 'statm': number of records

    • 'statMean': mean of var

    • 'statSD': standard deviation of var

    • 'statSE': standard error of the mean of var

    • 'statMedian': median of var

    • 'statMin': minimum of var

    • 'statMax': maximum of var

  • 'countTable':

    • 'variableGroup': factor with groups of var for which counts are reported

    • 'statN': number of subjects

    • 'statm': number of records
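
To make the meaning of these columns concrete, the sketch below rebuilds them by hand in base R for two hypothetical datasets. It does not call the package; the datasets, the variables AVAL and AESEV, and the standard error formula (standard deviation divided by the square root of the number of records) are assumptions made for illustration.

```r
# Numeric variable: one record per subject (hypothetical data)
dat <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  AVAL = c(10, 12, 14, 16)
)
summaryStats <- data.frame(
  statN = length(unique(dat$USUBJID)),      # number of subjects
  statm = nrow(dat),                        # number of records
  statMean = mean(dat$AVAL),
  statSD = sd(dat$AVAL),
  statSE = sd(dat$AVAL) / sqrt(nrow(dat)),  # assumed formula
  statMedian = median(dat$AVAL),
  statMin = min(dat$AVAL),
  statMax = max(dat$AVAL)
)

# Categorical variable: subject "01" has two records (hypothetical data)
ae <- data.frame(
  USUBJID = c("01", "01", "02", "03"),
  AESEV = factor(c("MILD", "MILD", "MILD", "SEVERE"))
)
countStats <- data.frame(
  variableGroup = factor(levels(ae$AESEV), levels = levels(ae$AESEV)),
  # number of distinct subjects per category
  statN = as.vector(tapply(ae$USUBJID, ae$AESEV, function(x) length(unique(x)))),
  # number of records per category
  statm = as.vector(table(ae$AESEV))
)

print(summaryStats)
print(countStats)
```

In the count example, the 'MILD' category has 2 subjects but 3 records, which shows how 'statN' and 'statm' can differ.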

Author(s)

Laure Cougnaud


inTextSummaryTable documentation built on Sept. 12, 2023, 5:06 p.m.