calculateStats: Calculate statistics for pairwise comparison of data sets


View source: R/calculateStats.R

Description

Calculate a range of statistics and p-values for comparison of two data sets.

Usage

calculateStats(
  df,
  ds1,
  ds2,
  column,
  subsampleSize,
  permute = FALSE,
  kmin,
  kfrac,
  xmin,
  xmax
)

Arguments

df

The input data frame. Must contain at least a column named 'dataset' and an additional column with values to use as the basis for the comparison.
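As an illustration of the expected layout, a minimal input might look like the sketch below. The data set labels and the column name value are hypothetical; only the 'dataset' column name is required by the description above.

```r
# Hypothetical input: one 'dataset' column plus one column of values
# to compare. The labels "ds1"/"ds2" and the name 'value' are made up.
df <- data.frame(
  dataset = rep(c("ds1", "ds2"), each = 3),
  value = c(1.2, 0.7, 3.1, 2.4, 0.9, 1.8)
)

# Check the requirement stated above before using the data frame
stopifnot("dataset" %in% colnames(df))
```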

ds1, ds2

The names of the two data sets to be compared.

column

The name of the column(s) of df to be used as the basis for the comparison.

subsampleSize

The number of observations for which certain time-consuming statistics will be calculated. The observations will be selected randomly among the rows of df.
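The random row selection could be sketched as follows. This assumes sampling without replacement and caps the sample at the number of available rows; neither detail is confirmed by this page, and the data frame is hypothetical.

```r
set.seed(123)  # only to make the illustration reproducible

# Hypothetical data frame with 100 rows
df <- data.frame(
  dataset = rep(c("ds1", "ds2"), each = 50),
  value = rnorm(100)
)
subsampleSize <- 20

# Assumption: rows are drawn without replacement, and never more
# rows than df contains
idx <- sample(seq_len(nrow(df)), min(subsampleSize, nrow(df)))
dfSub <- df[idx, , drop = FALSE]
```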

permute

Whether to permute the dataset column of df before calculating the statistics.
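Permuting the dataset column amounts to shuffling the labels while leaving the values in place, which breaks any association between label and value. A minimal sketch with hypothetical data (how the function performs the shuffle internally is not stated here):

```r
set.seed(42)  # only to make the illustration reproducible

# Hypothetical data frame
df <- data.frame(
  dataset = rep(c("ds1", "ds2"), each = 5),
  value = rnorm(10)
)

# Shuffle the labels; the values and the group sizes are unchanged
dfPerm <- df
dfPerm$dataset <- sample(dfPerm$dataset)
```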

kmin, kfrac

For statistics that require the extraction of k nearest neighbors of a given point, the number of neighbors will be max(kmin, kfrac * nrow(df)).
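A worked example of this formula, using hypothetical values. Note that kfrac * nrow(df) need not be an integer; whether and how the function rounds the result is not stated on this page.

```r
# Hypothetical data frame with 200 rows
df <- data.frame(
  dataset = rep(c("ds1", "ds2"), each = 100),
  value = seq_len(200)
)
kmin <- 5
kfrac <- 0.01

# kfrac * nrow(df) = 0.01 * 200 = 2, which is below kmin,
# so kmin determines the number of neighbors
k <- max(kmin, kfrac * nrow(df))
k  # 5
```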

xmin, xmax

The smallest and largest values of column, used to normalize the x-axis when calculating the area between the eCDFs.
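One way to compute a range-normalized area between two eCDFs is sketched below. It is an illustration under stated assumptions, not necessarily the computation used by calculateStats: the data are simulated, and dividing by xmax - xmin is assumed to be what "normalize the x-axis" means.

```r
set.seed(1)  # only to make the illustration reproducible

# Hypothetical values for the two data sets
x1 <- rnorm(200)
x2 <- rnorm(200, mean = 0.5)
xmin <- min(c(x1, x2))
xmax <- max(c(x1, x2))

e1 <- ecdf(x1)
e2 <- ecdf(x2)

# eCDFs are right-continuous step functions, so both are constant
# between consecutive observed values; the integral is a finite sum
grid <- sort(unique(c(x1, x2)))
left <- grid[-length(grid)]
widths <- diff(grid)
heights <- abs(e1(left) - e2(left))

# Assumption: rescaling the x-axis to [0, 1] means dividing by the range
area <- sum(widths * heights) / (xmax - xmin)
```

With this normalization the area lies between 0 and 1, which makes values comparable across columns measured on different scales.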

Value

A vector of statistics and p-values.

Author(s)

Charlotte Soneson


countsimQC documentation built on Feb. 5, 2021, 2:02 a.m.