emcdf: Computes multivariate empirical joint distribution


Description

This function computes the empirical joint distribution (joint CDF) using a single thread or multiple threads.

Usage

emcdf(data, a)

Arguments

data

a numeric matrix storing the data, or an S4 object of class "emcdf_obj".

a

a numeric vector or matrix of points at which the CDF is evaluated.

Details

When data is a numeric matrix, this function computes the joint empirical CDF with a single thread. When data is an object of class "emcdf_obj", it computes with multiple threads. If "a" is a vector, its length must equal the number of columns of data; if "a" is a matrix, it must have the same number of columns as data.

Both the single-thread and multi-thread emcdf algorithms are faster than using the built-in function sum{base}; see the example for a timing comparison. Note that initializing threads and splitting the data takes time, although it is a one-time cost. Multi-thread is therefore recommended for large data and many CDF evaluations, while single-thread is faster for small data and few CDF evaluations.
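
To make the quantity being computed concrete, here is a minimal base-R sketch of the joint empirical CDF: the fraction of rows of data that are less than or equal to the evaluation point in every coordinate. The function name joint_ecdf_base and the toy data are hypothetical and are not part of the Emcdf package; the sketch does not reflect how emcdf is implemented internally and is only meant as a slow reference to compare results against.

```r
# Hypothetical base-R reference, NOT part of the Emcdf package.
# data: n x d numeric matrix
# a:    length-d numeric vector, or m x d matrix with one evaluation point per row
joint_ecdf_base <- function(data, a) {
  data <- as.matrix(data)
  if (is.null(dim(a))) a <- matrix(a, nrow = 1)  # treat a vector as a single point
  stopifnot(ncol(a) == ncol(data))               # dimensions must match
  apply(a, 1, function(pt) {
    # TRUE for each row of data that is <= pt in every coordinate
    below <- rowSums(sweep(data, 2, pt, "<=")) == ncol(data)
    mean(below)
  })
}

set.seed(1)
toy <- cbind(x = rnorm(1000), y = rnorm(1000))
pts <- rbind(c(0, 0), c(0.5, 0.5), c(Inf, Inf))
res <- joint_ecdf_base(toy, pts)
res  # one value per row of pts; the last point (Inf, Inf) gives 1
```

The result has one entry per evaluation point, and the values are non-decreasing as the point grows in every coordinate.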

Value

a numeric vector containing the value(s) of the empirical joint CDF.

Examples

library(Emcdf)

n = 10^6
set.seed(123)
x = rnorm(n)
y = rnorm(n)
z = rnorm(n)
data = cbind(x, y, z)
#The aim is to compute F(0.5, 0.5, 0.5) with three
#approaches and compare their performance.
#To reduce timing noise, we repeat the computation 10 times.

#compute with R built-in function, sum()
sum_time = system.time({
  aws1 = numeric(10)
  for(i in 1:10)
    aws1[i] = sum(x <= 0.5 & y <= 0.5 & z <= 0.5)/n
})[3]

#compute with emcdf single-thread
a = matrix(rep(c(0.5, 0.5, 0.5), 10), 10, 3)
single_time = system.time({
   aws2 = emcdf(data, a)
})[3]

#compute with emcdf multi-thread
obj = initF(data, 4)
multi_time = system.time({
   aws3 = emcdf(obj, a)
})[3]

#check that the three approaches agree
aws2 == aws1
aws3 == aws1

#compare elapsed times
sum_time
single_time
multi_time

Emcdf documentation built on May 2, 2019, 1:47 p.m.
