Home

/

CRAN

/

GenOrd

/

ordsample: Drawing a sample of discrete data

ordsample: Drawing a sample of discrete data
In GenOrd: Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions

Description Usage Arguments Value Author(s) See Also Examples

View source: R/ordsample.R

The function draws a sample from a multivariate discrete variable with correlation matrix Sigma and prescribed marginal distributions marginal

1 2	ordsample(n, marginal, Sigma, support = list(), Spearman = FALSE, cormat = "discrete")

`n`	the sample size
`marginal`	a list of k elements, where k is the number of variables. The i-th element of `marginal` is the vector of the cumulative probabilities defining the marginal distribution of the i-th component of the multivariate variable. If the i-th component can take k_i values, the i-th element of marginal will contain k_i-1 probabilities (the k_i-th is obviously 1 and shall not be included).
`Sigma`	the target correlation matrix of the multivariate discrete variable
`support`	a list of k elements, where k is the number of variables. The i-th element of `support` is the vector containing the ordered values of the support of the i-th variable. By default, the support of the i-th variable is 1,2,...,k_i
`Spearman`	if `TRUE`, the function finds Spearman's correlations (and it is not necessary to provide `support`), if `FALSE` (default) Pearson's correlations
`cormat`	`"discrete"` if the `Sigma` in input is the target correlation matrix of the multivariate discrete variable; `"continuous"` if the `Sigma` in input is the intermediate correlation matrix of the multivariate standard normal

a n\times k matrix of values drawn from the k-variate discrete r.v. with the desired marginal distributions and correlation matrix

Alessandro Barbiero, Pier Alda Ferrari

contord, ordcont, corrcheck

# Example 1

# draw a sample from a bivariate ordinal variable
# with 4 of categories and asymmetrical marginal distributions
# and correlation coefficient 0.6 (to be checked)
k <- 2
marginal <- list(c(0.1,0.3,0.6), c(0.4,0.7,0.9))
corrcheck(marginal) # check ok
Sigma <- matrix(c(1,0.6,0.6,1),2,2)
# sample size 1000
n <- 1000
# generate a sample of size n
m <- ordsample(n, marginal, Sigma)
head(m)
# sample correlation matrix
cor(m) # compare it with Sigma
# empirical marginal distributions
cumsum(table(m[,1]))/n
cumsum(table(m[,2]))/n # compare them with the two marginal distributions

# Example 1bis

# draw a sample from a bivariate ordinal variable
# with 4 of categories and asymmetrical marginal distributions
# and Spearman correlation coefficient 0.6 (to be checked)
k <- 2
marginal <- list(c(0.1,0.3,0.6), c(0.4,0.7,0.9))
corrcheck(marginal, Spearman=TRUE) # check ok
Sigma <- matrix(c(1,0.6,0.6,1),2,2)
# sample size 1000
n <- 1000
# generate a sample of size n
m <- ordsample(n, marginal, Sigma, Spearman=TRUE)
head(m)
# sample correlation matrix
cor(rank(m[,1]),rank(m[,2])) # compare it with Sigma
# empirical marginal distributions
cumsum(table(m[,1]))/n
cumsum(table(m[,2]))/n # compare them with the two marginal distributions

# Example 1ter

# draw a sample from a bivariate random variable
# with binomial marginal distributions (n=3, p=1/3 and n=4, p=2/3)
# and Pearson correlation coefficient 0.6 (to be checked)
k <- 2
marginal <- list(pbinom(0:2, 3, 1/3),pbinom(0:3, 4, 2/3))
marginal
corrcheck(marginal, support=list(0:3, 0:4)) # check ok
Sigma <- matrix(c(1,0.6,0.6,1),2,2)
# sample size 1000
n <- 1000
# generate a sample of size n
m <- ordsample(n, marginal, Sigma, support=list(0:3,0:4))
head(m)
# sample correlation matrix
cor(m) # compare it with Sigma
# empirical marginal distributions
cumsum(table(m[,1]))/n
cumsum(table(m[,2]))/n # compare them with the two marginal distributions

# Example 2

# draw a sample from a 4-dimensional ordinal variable
# with different number of categories and uniform marginal distributions
# and different correlation coefficients
k <- 4
marginal <- list(0.5, c(1/3,2/3), c(1/4,2/4,3/4), c(1/5,2/5,3/5,4/5))
corrcheck(marginal)
# select a feasible correlation matrix
Sigma <- matrix(c(1,0.5,0.4,0.3,0.5,1,0.5,0.4,0.4,0.5,1,0.5,0.3,0.4,0.5,1),
4, 4, byrow=TRUE)
Sigma
# sample size 100
n <- 100
# generate a sample of size n
set.seed(1)
m <- ordsample(n, marginal, Sigma)
# sample correlation matrix
cor(m) # compare it with Sigma
# empirical marginal distribution
cumsum(table(m[,4]))/n # compare it with the fourth marginal
head(m)
# or equivalently...
set.seed(1)
res <- ordcont(marginal, Sigma)
res[[1]] # the intermediate correlation matrix of the multivariate normal
m <- ordsample(n, marginal, res[[1]], cormat="continuous")
head(m)

# Example 3
# simulation of two correlated Poisson r.v.
# modification to GenOrd sampling function for Poisson distribution
ordsamplep<-function (n, lambda, Sigma) 
{
   k <- length(lambda)
   valori <- mvrnorm(n, rep(0, k), Sigma)
   for (i in 1:k)
   {
   valori[, i] <- qpois(pnorm(valori[,i]), lambda[i])
   }
   return(valori)
}
# number of variables
k <- 2
# Poisson parameters
lambda <- c(2, 5)
# correlation matrix
Sigma <- matrix(0.25, 2, 2)
diag(Sigma) <- 1
# sample size
n <- 10000
# preliminar stage: support TRUNCATION
# required for recovering the correlation matrix
# of the standard bivariate normal
# truncation error
epsilon <- 0.0001
# corresponding maximum value
kmax <- qpois(1-epsilon, lambda)
# truncated marginals
l <- list()
for(i in 1:k)
{
l[[i]] <- 0:kmax[i]
}
marg <- list()
for(i in 1:k)
{
marg[[i]] <- dpois(0:kmax[i],lambda[i])
marg[[i]][kmax[i]+1] <- 1-sum(marg[[i]][1:(kmax[i])])
}
cm <- list()
for(i in 1:k)
{
cm[[i]] <- cumsum(marg[[i]])
cm[[i]] <- cm[[i]][-(kmax[i]+1)]
}
# check feasibility of correlation matrix
RB <- corrcheck(cm, support=l)
RL <- RB[[1]]
RU <- RB[[2]]
Sigma <= RU & Sigma >= RL # OK
res <- ordcont(cm, Sigma, support=l)
res[[1]]
Sigma <- res[[1]]
# draw the sample
m <- ordsamplep(n, lambda, Sigma)
# sample correlation matrix
cor(m)
head(m)

# Example 4
# simulation of 4 correlated binary and Poisson r.v.'s (2+2)
# modification to GenOrd sampling function
ordsamplep <- function (n, marginal, lambda, Sigma) 
{
k <- length(lambda)
valori <- mvrnorm(n, rep(0, k), Sigma)
for(i in 1:k)
{
if(lambda[i]==0)
{
	valori[, i] <- as.integer(cut(valori[, i], breaks = c(min(valori[,i]) - 1,
               qnorm(marginal[[i]]), max(valori[, i]) + 1)))
valori[, i] <- support[[i]][valori[, i]]
}
else
{
	valori[, i] <- qpois(pnorm(valori[,i]), lambda[i])
}
	}
return(valori)
}
# number of variables
k <- 4
# Poisson parameters (only 3rd and 4th are Poisson)
lambda <- c(0, 0, 2, 5)
# 1st and 2nd are Bernoulli with p=0.5
marginal <- list()
marginal[[1]] <- .5
marginal[[2]] <- .5
marginal[[3]] <- 0
marginal[[4]] <- 0
# support
support <- list()
support[[1]] <- 0:1
support[[2]] <- 0:1
# correlation matrix
Sigma <- matrix(0.25, k, k)
diag(Sigma) <- 1
# sample size
n <- 10000
# preliminar stage: support TRUNCATION
# required for recovering the correlation matrix
# of the standard bivariate normal
# truncation error
epsilon <- 0.0001
# corresponding maximum value
kmax <- qpois(1-epsilon, lambda)
# truncated marginals
for(i in 3:4)
{
support[[i]] <- 0:kmax[i]
}
marg <- list()
for(i in 3:4)
{
marg[[i]] <- dpois(0:kmax[i],lambda[i])
marg[[i]][kmax[i]+1] <- 1-sum(marg[[i]][1:(kmax[i])])
}
for(i in 3:4)
{
marginal[[i]] <- cumsum(marg[[i]])
marginal[[i]] <- marginal[[i]][-(kmax[i]+1)]
}
# check feasibility of correlation matrix
RB <- corrcheck(marginal, support=support)
RL <- RB[[1]]
RU <- RB[[2]]
Sigma <= RU & Sigma >= RL # OK
# compute correlation matrix of the 4-variate standard normal
res <- ordcont(marginal, Sigma, support=support)
res[[1]]
Sigma <- res[[1]]
# draw the sample
m <- ordsamplep(n, marginal, lambda, Sigma)
# sample correlation matrix
cor(m)
head(m)

Loading required package: mvtnorm
Loading required package: Matrix
Loading required package: MASS
[[1]]
2 x 2 Matrix of class "dsyMatrix"
     [,1] [,2]
[1,]    1   -1
[2,]   -1    1

[[2]]
2 x 2 Matrix of class "dsyMatrix"
     [,1] [,2]
[1,]  1.0  0.8
[2,]  0.8  1.0

     [,1] [,2]
[1,]    4    3
[2,]    4    3
[3,]    4    2
[4,]    3    1
[5,]    2    1
[6,]    3    2
          [,1]      [,2]
[1,] 1.0000000 0.6069379
[2,] 0.6069379 1.0000000
    1     2     3     4 
0.106 0.299 0.596 1.000 
    1     2     3     4 
0.377 0.689 0.898 1.000 
[[1]]
2 x 2 Matrix of class "dsyMatrix"
     [,1] [,2]
[1,]    1   -1
[2,]   -1    1

[[2]]
2 x 2 Matrix of class "dsyMatrix"
          [,1]      [,2]
[1,] 1.0000000 0.8733333
[2,] 0.8733333 1.0000000

     [,1] [,2]
[1,]    3    3
[2,]    1    1
[3,]    2    1
[4,]    3    3
[5,]    3    1
[6,]    2    1
[1] 0.5921783
    1     2     3     4 
0.106 0.294 0.605 1.000 
    1     2     3     4 
0.403 0.713 0.904 1.000 
[[1]]
[1] 0.2962963 0.7407407 0.9629630

[[2]]
[1] 0.01234568 0.11111111 0.40740741 0.80246914

[[1]]
2 x 2 Matrix of class "dsyMatrix"
           [,1]       [,2]
[1,]  1.0000000 -0.8660254
[2,] -0.8660254  1.0000000

[[2]]
2 x 2 Matrix of class "dsyMatrix"
          [,1]      [,2]
[1,] 1.0000000 0.8499879
[2,] 0.8499879 1.0000000

     [,1] [,2]
[1,]    0    2
[2,]    2    3
[3,]    0    2
[4,]    2    3
[5,]    1    2
[6,]    0    3
          [,1]      [,2]
[1,] 1.0000000 0.5986331
[2,] 0.5986331 1.0000000
    0     1     2     3 
0.310 0.739 0.957 1.000 
    0     1     2     3     4 
0.010 0.118 0.419 0.811 1.000 
[[1]]
4 x 4 Matrix of class "dsyMatrix"
           [,1]       [,2]       [,3]       [,4]
[1,]  1.0000000 -0.8164966 -0.8944272 -0.8485281
[2,] -0.8164966  1.0000000 -0.9128709 -0.9237604
[3,] -0.8944272 -0.9128709  1.0000000 -0.9486833
[4,] -0.8485281 -0.9237604 -0.9486833  1.0000000

[[2]]
4 x 4 Matrix of class "dsyMatrix"
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.8164966 0.8944272 0.8485281
[2,] 0.8164966 1.0000000 0.9128709 0.9237604
[3,] 0.8944272 0.9128709 1.0000000 0.9486833
[4,] 0.8485281 0.9237604 0.9486833 1.0000000

     [,1] [,2] [,3] [,4]
[1,]  1.0  0.5  0.4  0.3
[2,]  0.5  1.0  0.5  0.4
[3,]  0.4  0.5  1.0  0.5
[4,]  0.3  0.4  0.5  1.0
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.4373221 0.2979056 0.1453077
[2,] 0.4373221 1.0000000 0.4456337 0.3782520
[3,] 0.2979056 0.4456337 1.0000000 0.3892966
[4,] 0.1453077 0.3782520 0.3892966 1.0000000
   1    2    3    4    5 
0.22 0.45 0.62 0.81 1.00 
     [,1] [,2] [,3] [,4]
[1,]    1    3    3    5
[2,]    2    1    1    4
[3,]    1    3    2    5
[4,]    1    1    1    1
[5,]    1    2    4    2
[6,]    2    3    1    4
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.6576809 0.5230593 0.3920468
[2,] 0.6576809 1.0000000 0.5853667 0.4669768
[3,] 0.5230593 0.5853667 1.0000000 0.5587005
[4,] 0.3920468 0.4669768 0.5587005 1.0000000
     [,1] [,2] [,3] [,4]
[1,]    1    3    3    5
[2,]    2    1    1    4
[3,]    1    3    2    5
[4,]    1    1    1    1
[5,]    1    2    4    2
[6,]    2    3    1    4
2 x 2 Matrix of class "lgeMatrix"
     [,1] [,2]
[1,] TRUE TRUE
[2,] TRUE TRUE
          [,1]      [,2]
[1,] 1.0000000 0.2628508
[2,] 0.2628508 1.0000000
          [,1]      [,2]
[1,] 1.0000000 0.2639675
[2,] 0.2639675 1.0000000
     [,1] [,2]
[1,]    4    6
[2,]    5    8
[3,]    1    4
[4,]    2    3
[5,]    1    4
[6,]    2    4
4 x 4 Matrix of class "lgeMatrix"
     [,1] [,2] [,3] [,4]
[1,] TRUE TRUE TRUE TRUE
[2,] TRUE TRUE TRUE TRUE
[3,] TRUE TRUE TRUE TRUE
[4,] TRUE TRUE TRUE TRUE
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.3826825 0.3259881 0.3178154
[2,] 0.3826825 1.0000000 0.3259881 0.3178154
[3,] 0.3259881 0.3259881 1.0000000 0.2628508
[4,] 0.3178154 0.3178154 0.2628508 1.0000000
          [,1]      [,2]      [,3]      [,4]
[1,] 1.0000000 0.2500256 0.2384720 0.2540702
[2,] 0.2500256 1.0000000 0.2294206 0.2500565
[3,] 0.2384720 0.2294206 1.0000000 0.2626674
[4,] 0.2540702 0.2500565 0.2626674 1.0000000
     [,1] [,2] [,3] [,4]
[1,]    1    1    4    8
[2,]    1    1    1    1
[3,]    0    0    0    2
[4,]    0    0    2    9
[5,]    0    1    3    6
[6,]    1    0    2    6

GenOrd documentation built on May 2, 2019, 8:16 a.m.

GenOrd index

Package overview

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

GenOrd
Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions

ordsample: Drawing a sample of discrete data
In GenOrd: Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Example output

Related to ordsample in GenOrd...

R Package Documentation

Browse R Packages

We want your feedback!

GenOrd Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions

ordsample: Drawing a sample of discrete data In GenOrd: Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Example output

Related to ordsample in GenOrd...

R Package Documentation

Browse R Packages

We want your feedback!

GenOrd
Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions

ordsample: Drawing a sample of discrete data
In GenOrd: Simulation of Discrete Random Variables with Given Correlation Matrix and Marginal Distributions