The goal of mvtraits
is to efficiently fit Bayesian (hierarchical)
multivariate models to trait data. A key feature is the ability to
perform multiple imputation of partially missing data.
This package is not currently available on CRAN. To install directly from this repository, do the following:
devtools::install_github("ashiklom/mvtraits")
This is a basic example showing how a multivariate distribution can be fit to the classic Iris dataset.
library(mvtraits)
data(iris)
The input data must be a matrix, with each trait in its own column.
Below, we drop the Species
column and convert the result to a matrix.
iris_mat <- as.matrix(iris[, !grepl("Species", colnames(iris))])
Fit a multivariate distribution with fit_mvnorm
:
iris_fit <- fit_mvnorm(iris_mat)
#> Running sampler...
#> All parameters have converged
#> Calculating correlation matrices...
#> Converting samples to coda mcmc.list object...
#> Preparing summary table...
#> Joining, by = "rowname"
str(iris_fit, max.level = 2)
#> List of 3
#> $ summary_table:Classes 'tbl_df', 'tbl' and 'data.frame': 20 obs. of 12 variables:
#> ..$ rowname : chr [1:20] "mu..Sepal.Length" "mu..Sepal.Width" "mu..Petal.Length" "mu..Petal.Width" ...
#> ..$ variable : chr [1:20] "mu" "mu" "mu" "mu" ...
#> ..$ index : chr [1:20] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" ...
#> ..$ Mean : num [1:20] 5.782 3.063 3.63 1.146 0.695 ...
#> ..$ SD : num [1:20] 0.0667 0.036 0.1456 0.0634 0.0817 ...
#> ..$ Naive SE : num [1:20] 0.000771 0.000415 0.001681 0.000732 0.000943 ...
#> ..$ Time-series SE: num [1:20] 0.000771 0.000415 0.001747 0.000732 0.000943 ...
#> ..$ 2.5% : num [1:20] 5.648 2.994 3.34 1.019 0.554 ...
#> ..$ 25% : num [1:20] 5.737 3.038 3.531 1.104 0.637 ...
#> ..$ 50% : num [1:20] 5.783 3.063 3.633 1.148 0.688 ...
#> ..$ 75% : num [1:20] 5.829 3.087 3.731 1.19 0.746 ...
#> ..$ 97.5% : num [1:20] 5.91 3.135 3.908 1.264 0.874 ...
#> $ stats :List of 3
#> ..$ mu :List of 3
#> ..$ Sigma:List of 3
#> ..$ Corr :List of 3
#> $ samples :List of 3
#> ..$ : 'mcmc' num [1:5000, 1:20] 5.82 5.74 5.79 5.82 5.74 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. ..- attr(*, "mcpar")= num [1:3] 1 5000 1
#> ..$ : 'mcmc' num [1:5000, 1:20] 5.85 5.74 5.78 5.82 5.74 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. ..- attr(*, "mcpar")= num [1:3] 1 5000 1
#> ..$ : 'mcmc' num [1:5000, 1:20] 5.83 5.75 5.78 5.82 5.73 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. ..- attr(*, "mcpar")= num [1:3] 1 5000 1
#> ..- attr(*, "class")= chr "mcmc.list"
Compare the results to univariate summary statistics.
iris_fit[[c("stats", "mu", "Mean")]]
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 5.782125 3.062608 3.630155 1.146483
colMeans(iris_mat)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 5.843333 3.057333 3.758000 1.199333
iris_fit[[c("stats", "Sigma", "Mean")]]
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length 0.69497210 -0.04291286 1.2810738 0.5189763
#> Sepal.Width -0.04291286 0.19642316 -0.3308101 -0.1223293
#> Petal.Length 1.28107379 -0.33081015 3.1404326 1.3031064
#> Petal.Width 0.51897633 -0.12232934 1.3031064 0.5909668
cov(iris_mat)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length 0.6856935 -0.0424340 1.2743154 0.5162707
#> Sepal.Width -0.0424340 0.1899794 -0.3296564 -0.1216394
#> Petal.Length 1.2743154 -0.3296564 3.1162779 1.2956094
#> Petal.Width 0.5162707 -0.1216394 1.2956094 0.5810063
Fitting models also works with partially missing data. Here, we remove half of the data at random.
iris_mat2 <- iris_mat
nmat <- length(iris_mat2)
iris_mat2[sample.int(nmat, floor(nmat / 2))] <- NA
head(iris_mat2)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> [1,] 5.1 3.5 1.4 0.2
#> [2,] 4.9 3.0 NA 0.2
#> [3,] 4.7 3.2 NA 0.2
#> [4,] 4.6 3.1 1.5 0.2
#> [5,] 5.0 NA 1.4 0.2
#> [6,] NA NA 1.7 0.4
Now, re-fit the model.
iris_fit2 <- fit_mvnorm(iris_mat2)
#> Running sampler...
#> All parameters have converged
#> Calculating correlation matrices...
#> Converting samples to coda mcmc.list object...
#> Preparing summary table...
#> Joining, by = "rowname"
iris_fit2[[c("stats", "mu", "Mean")]]
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> 5.790697 3.047071 3.632278 1.146587
iris_fit2[[c("stats", "Sigma", "Mean")]]
#> Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length 0.72814113 -0.07677293 1.3472674 0.5443821
#> Sepal.Width -0.07677293 0.17147239 -0.2328279 -0.1018946
#> Petal.Length 1.34726742 -0.23282791 3.1963764 1.3270015
#> Petal.Width 0.54438209 -0.10189459 1.3270015 0.6116293
mvtraits
can also fit hierarchical models with the fit_mvnorm_hier
function. The arguments are the same as fit_mvnorm
, but also need a
group
argument that takes a vector of groups (similar to base R’s
tapply
).
iris_fit2h <- fit_mvnorm_hier(iris_mat2, iris[["Species"]])
#> Running sampler...
#> All parameters have converged
#> Calculating correlation matrices...
#> Converting samples to coda mcmc.list object...
#> Preparing summary table...
#> Joining, by = "rowname"
str(iris_fit2h, max.level = 2)
#> List of 3
#> $ summary_table:Classes 'tbl_df', 'tbl' and 'data.frame': 80 obs. of 13 variables:
#> ..$ rowname : chr [1:80] "mu..global..Sepal.Length" "mu..global..Sepal.Width" "mu..global..Petal.Length" "mu..global..Petal.Width" ...
#> ..$ variable : chr [1:80] "mu" "mu" "mu" "mu" ...
#> ..$ group : chr [1:80] "global" "global" "global" "global" ...
#> ..$ index : chr [1:80] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" ...
#> ..$ Mean : num [1:80] 1.222 0.944 -0.068 -0.173 22.097 ...
#> ..$ SD : num [1:80] 1.029 0.926 0.962 0.813 27.465 ...
#> ..$ Naive SE : num [1:80] 0.01188 0.01069 0.0111 0.00939 0.31708 ...
#> ..$ Time-series SE: num [1:80] 0.0317 0.0347 0.0236 0.0326 0.4221 ...
#> ..$ 2.5% : num [1:80] -0.867 -0.957 -1.983 -1.77 3.22 ...
#> ..$ 25% : num [1:80] 0.545 0.323 -0.694 -0.714 8.589 ...
#> ..$ 50% : num [1:80] 1.234 0.98 -0.069 -0.164 14.442 ...
#> ..$ 75% : num [1:80] 1.923 1.594 0.58 0.356 25.147 ...
#> ..$ 97.5% : num [1:80] 3.24 2.66 1.82 1.43 88.81 ...
#> $ stats :List of 6
#> ..$ mu_global :List of 3
#> ..$ Sigma_global:List of 3
#> ..$ Corr_global :List of 3
#> ..$ mu_group :List of 3
#> ..$ Sigma_group :List of 3
#> ..$ Corr_group :List of 3
#> $ samples :List of 3
#> ..$ : 'mcmc' num [1:5000, 1:80] 2.01 1.67 2.01 3.06 2.51 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. ..- attr(*, "mcpar")= num [1:3] 1 5000 1
#> ..$ : 'mcmc' num [1:5000, 1:80] 2.93 3.86 3.47 3.89 3.29 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. ..- attr(*, "mcpar")= num [1:3] 1 5000 1
#> ..$ : 'mcmc' num [1:5000, 1:80] 2.42 3.57 4.16 4.31 4.27 ...
#> .. ..- attr(*, "dimnames")=List of 2
#> .. ..- attr(*, "mcpar")= num [1:3] 1 5000 1
#> ..- attr(*, "class")= chr "mcmc.list"
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.