knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 7, out.width = NULL )
library(dimensio)
## Load data data(iris) head(iris) ## Compute PCA ## (non numeric variables are automatically removed) X <- pca(iris, center = TRUE, scale = TRUE)
dimensio provides several methods to extract (get_*()
) the results:
get_data()
returns the original data.get_contributions()
returns the contributions to the definition of the principal dimensions.get_coordinates()
returns the principal or standard coordinates.get_correlations()
returns the correlations between variables and dimensions.get_cos2()
returns the cos^2^ values (i.e. the quality of the representation of the points on the factor map).get_eigenvalues()
returns the eigenvalues, the percentages of variance and the cumulative percentages of variance.The package also allows to quickly visualize (viz_*()
) the results:
biplot()
produces a biplot.screeplot()
produces a scree plot.viz_rows()
/viz_individuals()
displays row/individual principal coordinates.viz_columns()
/viz_variables()
displays columns/variable principal coordinates.viz_contributions()
displays (joint) contributions. viz_cos2()
displays (joint) cos^2^.## Get eigenvalues get_eigenvalues(X) ## Scree plot screeplot(X, cumulative = TRUE) ## Plot variable contributions to the definition of the first two axes viz_contributions(X, margin = 2, axes = c(1, 2))
A biplot is the simultaneous representation of rows and columns of a rectangular dataset. It is the generalization of a scatterplot to the case of mutlivariate data: it allows to visualize as much information as possible in a single graph [@greenacre2010].
dimensio allows to display two types of biplots: a form biplot (row-metric-preserving biplot) or a covariance biplot (column-metric-preserving biplot). See @greenacre2010 for more details about biplots.
The form biplot favors the representation of the individuals: the distance between the individuals approximates the Euclidean distance between rows. In the form biplot the length of a vector approximates the quality of the representation of the variable.
biplot(X, type = "form", label = "variables")
The covariance biplot favors the representation of the variables: the length of a vector approximates the standard deviation of the variable and the cosine of the angle formed by two vectors approximates the correlation between the two variables [@greenacre2010]. In the covariance biplot the distance between the individuals approximates the Mahalanobis distance between rows.
biplot(X, type = "covariance", label = "variables")
Biplots have the drawbacks of their advantages: they can quickly become difficult to read as they display a lot of information at once. It may then be preferable to visualize the results for individuals and variables separately.
viz_variables()
depicts the variables by rays emanating from the origin (both their lengths and directions are important to the interpretation).
## Plot variables factor map viz_variables(X)
viz_variables()
allows to highlight additional information by varying different graphical elements (color, transparency, shape and size of symbols...).
## Highlight cos2 viz_variables( x = X, highlight = "cos2", col = khroma::color("YlOrBr")(4, range = c(0.5, 1)), legend = list(x = "bottomleft") )
viz_individuals()
allows to display individuals and to highlight additional information.
## Plot individuals and color by species viz_individuals( x = X, highlight = iris$Species, col = khroma::color("bright")(3), # Custom color scale pch = c(15, 16, 17), # Custom symbols legend = list(x = "bottomright") )
## Add ellipses viz_individuals(x = X) viz_tolerance(x = X, group = iris$Species, level = 0.95, border = khroma::color("high contrast")(3)) ## Add convex hull viz_individuals(x = X) viz_hull(x = X, group = iris$Species, level = 0.95, border = khroma::color("high contrast")(3))
## Highlight petal length viz_individuals( x = X, highlight = iris$Petal.Length, col = khroma::color("YlOrBr")(12), # Custom color scale cex = c(1, 2), # Custom size scale pch = 16, legend = list(x = "bottomleft") )
## Highlight contributions viz_individuals( x = X, highlight = "contrib", col = khroma::color("iridescent")(12), # Custom color scale cex = c(1, 2), # Custom size scale pch = 16, legend = list(x = "bottomleft") )
If you need more flexibility, the get_*()
family and the tidy()
and augment()
functions allow you to extract the results as data frames and thus build custom graphs with base graphics or ggplot2.
iris_tidy <- tidy(X, margin = 2) head(iris_tidy) iris_augment <- augment(X, margin = 1) head(iris_augment)
## Custom plot with ggplot2 ggplot2::ggplot(data = iris_augment) + ggplot2::aes(x = F1, y = F2, colour = contribution) + ggplot2::geom_vline(xintercept = 0, linewidth = 0.5, linetype = "dashed") + ggplot2::geom_hline(yintercept = 0, linewidth = 0.5, linetype = "dashed") + ggplot2::geom_point() + ggplot2::coord_fixed() + # /!\ ggplot2::theme_bw() + khroma::scale_color_iridescent()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.