knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Recall that a random variable $X$ has (univariate) negative binomial law with parameters $\kappa>0, 0<p<1$, i.e., $X\sim \text{NB}(\kappa, p)$ if its probability mass function is given by $$ P(X=x) = {\kappa+x-1 \choose x} p^x(1-p)^{\kappa}, \quad x \in {0,1,\ldots}. $$
A random vector ${\bf X}=(X_1,\dots,X_n)'$ is said to follow the multivariate negative binomial distribution with parameters $\kappa, p_1, \dots, p_n$ if its probability mass function is given by
$$
P({\bf X}={\bf x})=\frac{\Gamma(x_1+\cdots+x_n+\kappa)}{x_1! \cdots x_n! \Gamma(\kappa)}p_1^{x_1}\cdots p_n^{x_n}(1-p_1-\cdots-p_n)^{\kappa},
$$
where, for $i=1,\dots,n$, $x_i\in{0,1,\dots}$, $0
We note that the function stats::rnbinom
can be used to simulate from the univariate negative binomial distribution. The trawl
package introduces the function Bivariate_NBsim
which generates samples from the bivariate negative binomial distribution. The simulation algorithm proceeds in two steps:
First, we simulate $X_1$ from the univariate negative binomial distribution NB($\kappa$,$p_1/(1-p_2)$).
Second, we simulate $X_2|X_1=x_1$ from the univariate negative binomial distribution NB($\kappa+x_1,p_2$), see for instance [@Dunn1967].
set.seed(1) kappa<- 3 p1 <- 0.1 p2 <- 0.85 p <- p1+p2 p0 <-1-p1-p2 N<- 10000 #Simulate from the bivariate negative binomial distribution y <- trawl::Bivariate_NBsim(N,kappa,p1,p2) #Compare the empirical and theoretical mean of the first component base::mean(y[,1]) kappa*p1/(1-p) #Compare the empirical and theoretical variance of the first component stats::var(y[,1]) kappa*p1*(1-p2)/(1-p)^2 #Compare the empirical and theoretical mean of the second component base::mean(y[,2]) kappa*p2/(1-p) #Compare the empirical and theoretical variance of the second component stats::var(y[,2]) kappa*p2*(1-p1)/(1-p)^2 #Compare the empirical and theoretical correlation between the two components stats::cor(y[,1],y[,2]) (p1*p2/(p0+p1)*(p0+p2))^(1/2)
We say that a vector ${\bf X}=(X_1,\dots,X_n)'$ follows the multivariate logarithmic series distribution (LSD), see, e.g., [@GB1967].
${\bf X} \sim \text{LSD}(p_1,\ldots,p_n)$, where $0<p_i<1, p:=\sum_{i=1}^np_i <1$ if for ${\bf x}\in \mathbb{N}_0^n \setminus {{\bf 0} }$, if its probability mass function is given by
$$
P({\bf X}={\bf x})=\frac{\Gamma(x_1+\cdots+x_n)}{x_1! \cdots x_n!}\frac{p_1^{x_1}\cdots p_n^{x_n}}{{-\ln(1-p)}}.
$$
Note that each component $X_i$ follows the modified univariate logarithmic distribution with parameters $\tilde p_i = p_i/(1-p+p_i)$ and $\delta_i= \ln(1-p+p_i)/\ln(1-p)$, i.e.,
$X_i\sim\text{ModLSD}(\tilde p_i, \delta_i)$ with
$$
P(X_i =x_i) = \left{ \begin{array}{ll} \delta_i, & \text{for } x_i=0\
(1-\delta_i) \frac{1}{x_i}\frac{\tilde p_i^{x_i}}{{-\ln(1-\tilde p_i)}}, & \text{for } x_i \in \mathbb{N}. \end{array} \right.
$$
Simulations from the univariate LSD can be carried out using the function Runuran::urlogarithmic
. The trawl
package implements the functions Bivariate_LSDsim
and Trivariate_LSDsim
to simulate from both the bivariate and the trivariate logarithmic series distribution.
The function Bivariate_NBsim
can be used to simulate from the bivariate logarithmic series distribution. To this end, note that
the probability mass function of a random vector ${\bf X}=(X_1,X_2)'$
following the bivariate logarithmic series distribution with parameters
$0
The simulation proceeds in two steps: First, $X_1$ is simulated from the modified logarithmic distribution with parameters $\tilde p_1=p_1/(1-p_2)$ and $\delta_1=\ln(1-p_2)/\ln(1-p)$. Then we simulate $X_2$ conditional on $X_1$. We note that $X_2|X_1=x_1$ follows the logarithmic series distribution with parameter $p_2$ when $x_1=0$, and the negative binomial distribution with parameters $(x_1,p_2)$ when $x_1>0$.
Next we provide an example of a simulation from the bivariate LSD and we showcase the functions
ModLSD_Mean
, ModLSD_Var
, BivLSD_Cor
and BivLSD_Cov
which compute the mean and the variance of the univariate modified LSD and the correlation and covariance of the bivariate LSD, respectively.
set.seed(1) p1<-0.15 p2<-0.3 N<-10000 #Simulate N realisations from the bivariate LSD y<-trawl::Bivariate_LSDsim(N, p1, p2) #Compute the empirical and theoretical mean of the first component base::mean(y[,1]) trawl::ModLSD_Mean(base::log(1-p2)/base::log(1-p1-p2),p1/(1-p2)) #Compute the empirical and theoretical mean of the second component base::mean(y[,2]) trawl::ModLSD_Mean(base::log(1-p1)/base::log(1-p1-p2),p2/(1-p1)) #Compute the empirical and theoretical variance of the first component stats::var(y[,1]) trawl::ModLSD_Var(base::log(1-p2)/base::log(1-p1-p2),p1/(1-p2)) #Compute the empirical and theoretical variance of the second component stats::var(y[,2]) trawl::ModLSD_Var(base::log(1-p1)/base::log(1-p1-p2),p2/(1-p1)) ##Compute the empirical and theoretical correlation between the two components stats::cor(y[,1],y[,2]) trawl::BivLSD_Cor(p1,p2) ##Compute the empirical and theoretical covariance between the two components stats::cov(y[,1],y[,2]) trawl::BivLSD_Cov(p1,p2)
The function Trivariate_NBsim
can be used to simulate from the trivariate logarithmic series distribution.
The simulation proceeds in two steps: First, $X_1$ is simulated from the
modified logarithmic distribution with parameters $\tilde
p_1=p_1/(1-p_2-p_3)$ and $\delta_1=\ln(1-p_2-p_3)/\ln(1-p)$. Then we
simulate $(X_2,X_3)'$ conditional on $X_1$. We note that
$(X_2,X_3)'|X_1=x_1$ follows the bivariate logarithmic series
distribution with paramaters $(p_2,p_3)$ when $x_1=0$, and the
bivariate negative binomial distribution with parameters $(x_1,p_2,p_3)$
when $x_1>0$.
set.seed(1) p1<-0.15 p2<-0.25 p3<-0.55 N<- 10000 #Simulate N realisations from the bivariate LSD y<-trawl::Trivariate_LSDsim(N, p1, p2, p3) #Compute the empirical and theoretical mean of the first component base::mean(y[,1]) trawl::ModLSD_Mean(base::log(1-p2-p3)/base::log(1-p1-p2-p3),p1/(1-p2-p3)) #Compute the empirical and theoretical mean of the second component base::mean(y[,2]) trawl::ModLSD_Mean(base::log(1-p1-p3)/base::log(1-p1-p2-p3),p2/(1-p1-p3)) #Compute the empirical and theoretical mean of the third component base::mean(y[,3]) trawl::ModLSD_Mean(base::log(1-p1-p2)/base::log(1-p1-p2-p3),p3/(1-p1-p2)) #Compute the empirical and theoretical variance of the first component stats::var(y[,1]) trawl::ModLSD_Var(base::log(1-p2-p3)/base::log(1-p1-p2-p3),p1/(1-p2-p3)) #Compute the empirical and theoretical variance of the second component stats::var(y[,2]) trawl::ModLSD_Var(base::log(1-p1-p3)/base::log(1-p1-p2-p3),p2/(1-p1-p3)) #Compute the empirical and theoretical variance of the third component stats::var(y[,3]) trawl::ModLSD_Var(base::log(1-p1-p2)/base::log(1-p1-p2-p3),p3/(1-p1-p2)) #Computing the bivariate covariances and correlations #Cor(X1,X2): delta <- base::log(1-p3)/base::log(1-p1-p2-p3) hatp1 <-p1/(1-p3) hatp2<-p2/(1-p3) stats::cov(y[,1],y[,2]) trawl::BivModLSD_Cov(delta,hatp1,hatp2) stats::cor(y[,1],y[,2]) trawl::BivModLSD_Cor(delta,hatp1,hatp2) #Cor(X1,X3): delta <- log(1-p2)/log(1-p1-p2-p3) hatp1 <-p1/(1-p2) hatp2<-p3/(1-p2) stats::cov(y[,1],y[,3]) trawl::BivModLSD_Cov(delta,hatp1,hatp2) stats::cor(y[,1],y[,3]) trawl::BivModLSD_Cor(delta,hatp1,hatp2) #Cor(X2,X3): delta <- log(1-p1)/log(1-p1-p2-p3) hatp1 <-p2/(1-p1) hatp2<-p3/(1-p1) stats::cov(y[,2],y[,3]) trawl::BivModLSD_Cov(delta,hatp1,hatp2) stats::cor(y[,2],y[,3]) trawl::BivModLSD_Cor(delta,hatp1,hatp2)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.