ggbiplot | R Documentation |
Build a biplot visualization from ordination data wrapped as a tbl_ord object.
ggbiplot(
  ordination = NULL,
  mapping = aes(x = 1, y = 2),
  axis.type = "interpolative",
  xlim = NULL,
  ylim = NULL,
  expand = TRUE,
  clip = "on",
  axis.percents = TRUE,
  sec.axes = NULL,
  scale.factor = NULL,
  scale_rows = NULL,
  scale_cols = NULL,
  ...
)

ord_aes(ordination, ...)
ordination |
A tbl_ord. |
mapping |
List of default aesthetic mappings to use for the biplot. The
default assigns the first two coordinates to the aesthetics |
axis.type |
Character, partially matched; whether to build an
|
xlim, ylim |
Limits for the x and y axes. |
expand |
If |
clip |
Should drawing be clipped to the extent of the plot panel? A
setting of |
axis.percents |
Whether to concatenate default axis labels with inertia percentages. |
sec.axes |
Matrix factor character to specify a secondary set of axes. |
scale.factor |
Numeric value used to scale the secondary axes against
the primary axes; ignored if |
scale_rows, scale_cols |
Either the character name of a numeric variable
in |
... |
Additional arguments passed to |
ggbiplot() produces a ggplot object from a tbl_ord object ordination. The baseline object is the default unadorned "ggplot"-class object p with the following differences from what ggplot2::ggplot() returns:
p$mapping is augmented with .matrix = .matrix, which expects either .matrix = "rows" or .matrix = "cols" from the biplot.
p$coordinates is defaulted to ggplot2::coord_equal() in order to faithfully render the geometry of an ordination. The optional parameters xlim, ylim, expand, and clip are passed to coord_equal() and default to its ggplot2 defaults.
When x or y is mapped to a coordinate of ordination, and if axis.percents is TRUE, p$labels$x or p$labels$y is defaulted to the coordinate name concatenated with the percentage of inertia captured by that coordinate.
p is assigned the class "ggbiplot" in addition to "ggplot". This serves no functional purpose currently.
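The differences listed above can be inspected directly on the returned object. The following is a minimal sketch, assuming the ordr package (which provides ggbiplot() and as_tbl_ord()) and ggplot2 are installed. The axis limits are arbitrary illustrative values, and the base R percentage calculation is included only for comparison with the default axis labels; it is not presented as the package's internal code.

```r
library(ggplot2)
library(ordr)

# PCA of the iris measurements, wrapped as a tbl_ord
iris_pca <- as_tbl_ord(princomp(iris[, -5], cor = TRUE))

# xlim, ylim, expand, and clip are forwarded to coord_equal()
# (the limits here are arbitrary values chosen for illustration)
p <- ggbiplot(iris_pca, xlim = c(-4, 4), ylim = c(-3, 3), clip = "off") +
  geom_rows_point()

class(p)              # includes "ggplot"
class(p$coordinates)  # a fixed-aspect Cartesian coordinate system
p$labels$x            # default label for the first coordinate

# share of inertia (variance) per component, computed by hand for comparison
inertia <- princomp(iris[, -5], cor = TRUE)$sdev^2
pct <- 100 * inertia / sum(inertia)
round(pct, 1)
```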
Furthermore, the user may feed single integer values to the x and y aesthetics, which will be interpreted as the corresponding coordinates in the ordination. Currently only 2-dimensional biplots are supported, so both x and y must take coordinate values.
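As an illustration of integer-valued coordinate aesthetics, the sketch below plots the second and third principal components rather than the default first two. It assumes ordr and ggplot2 are installed and reuses only functions that appear elsewhere on this page; the choice of components 2 and 3 is arbitrary.

```r
library(ggplot2)
library(ordr)

# PCA of the iris measurements, wrapped as a tbl_ord
iris_pca <- as_tbl_ord(princomp(iris[, -5], cor = TRUE))

# integers passed to x and y are read as coordinate indices,
# so this biplot shows the second and third components
p23 <- ggbiplot(iris_pca, aes(x = 2, y = 3)) +
  geom_rows_point() +
  geom_cols_vector()
p23
```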
ord_aes() is a convenience function that generates a full-rank set of coordinate aesthetics ..coord1, ..coord2, etc. mapped to the shared coordinates of the ordination object, along with any additional aesthetics that are processed internally by ggplot2::aes().
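A brief sketch of what ord_aes() returns may help here. It assumes ordr and ggplot2 are installed; Species is a hypothetical variable name used only to show that extra aesthetics are passed through, and nothing is evaluated against data at this point.

```r
library(ggplot2)
library(ordr)

# PCA of the iris measurements, wrapped as a tbl_ord
iris_pca <- as_tbl_ord(princomp(iris[, -5], cor = TRUE))

# one coordinate aesthetic per shared coordinate of the ordination
coord_map <- ord_aes(iris_pca)
names(coord_map)

# additional aesthetics are handled as in aes()
# (`Species` is a placeholder name, assumed for illustration)
ord_aes(iris_pca, color = Species)
```

Such a mapping is intended for layers that use more than two coordinates, so it would typically be supplied to an individual layer rather than to ggbiplot() itself.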
The axis.type parameter controls whether the biplot is interpolative or predictive, though predictive biplots are still experimental and limited to linear methods like PCA. Gower & Hand (1996) and Gower, Gardner–Lubbe, & le Roux (2011) thoroughly explain the construction and interpretation of predictive biplots.
A ggplot object.
ggbiplot() uses ggplot2::fortify() internally to produce a single data frame with a .matrix column distinguishing the subjects ("rows") and variables ("cols"). The stat layers stat_rows() and stat_cols() simply filter the data frame to one of these two.
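The combined data frame can be examined by calling fortify() on a tbl_ord directly. This is a minimal sketch, assuming ordr and ggplot2 are installed; the manual subsetting at the end only mimics, in base R, the kind of filtering described above and is not the package's own implementation.

```r
library(ggplot2)
library(ordr)

# PCA of the iris measurements, wrapped as a tbl_ord
iris_pca <- as_tbl_ord(princomp(iris[, -5], cor = TRUE))

# one data frame holding both factor matrices
iris_df <- fortify(iris_pca)

# how many records belong to each matrix
table(iris_df$.matrix)

# base R analogue of restricting a layer to the variables ("cols")
iris_df[iris_df$.matrix == "cols", ]
```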
The geom layers geom_rows_*() and geom_cols_*() call the corresponding stat in order to render plot elements for the corresponding factor matrix. geom_dims_*() selects a default matrix based on common practice, e.g. points for rows and arrows for columns.
Gower JC & Hand DJ (1996) Biplots. Chapman & Hall, ISBN: 0-412-71630-5.
Gower JC, Gardner–Lubbe S, & le Roux NJ (2011) Understanding Biplots. Wiley, ISBN: 978-0-470-01255-0. https://www.wiley.com/go/biplots
ggplot2::ggplot2(), on which ggbiplot() is built
# compute PCA of Anderson iris measurements
iris[, -5] %>%
  princomp(cor = TRUE) %>%
  as_tbl_ord() %>%
  confer_inertia(1) %>%
  mutate_rows(species = iris$Species) %>%
  mutate_cols(measure = gsub("\\.", " ", tolower(names(iris)[-5]))) %>%
  print() -> iris_pca

# row-principal biplot with rescaled secondary axis
iris_pca %>%
  ggbiplot(aes(color = species), sec.axes = "cols", scale.factor = 2) +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_rows_point() +
  geom_cols_vector(color = "#444444") +
  geom_cols_text_radiate(aes(label = measure), color = "#444444") +
  ggtitle(
    "Row-principal PCA biplot of Anderson iris measurements",
    "Variable loadings scaled to secondary axes"
  ) +
  expand_limits(y = c(-1, 3.5))

# Performance measures can be regressed on the artificial coordinates of
# ordinated vehicle specs. Because the ordination of specs ignores performance,
# these coordinates will probably not be highly predictive. The gradient of each
# performance measure along the artificial axes is visualized by projecting the
# regression coefficients onto the ordination biplot.

# scaled principal components analysis of vehicle specs
mtcars_specs_pca <- ordinate(
  mtcars, cols = c(cyl, disp, hp, drat, wt, vs, carb),
  model = ~ princomp(., cor = TRUE)
)

# data frame of vehicle performance measures
mtcars %>%
  subset(select = c(mpg, qsec)) %>%
  as.matrix() %>%
  print() -> mtcars_perf

# regress performance measures on principal components
lm(mtcars_perf ~ get_rows(mtcars_specs_pca)) %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  print() -> mtcars_pca_lm

# regression biplot
ggbiplot(mtcars_specs_pca, aes(label = name),
         sec.axes = "rows", scale.factor = .5) +
  theme_minimal() +
  geom_rows_text(size = 3) +
  geom_cols_vector(data = mtcars_pca_lm) +
  geom_cols_text_radiate(data = mtcars_pca_lm) +
  expand_limits(x = c(-2.5, 2))

# multidimensional scaling based on a scaled cosine distance of vehicle specs
cosine_dist <- function(x) {
  x <- as.matrix(x)
  num <- x %*% t(x)
  denom_rt <- as.matrix(rowSums(x^2))
  denom <- sqrt(denom_rt %*% t(denom_rt))
  as.dist(1 - num / denom)
}
mtcars %>%
  subset(select = c(cyl, disp, hp, drat, wt, vs, carb)) %>%
  scale() %>%
  cosine_dist() %>%
  cmdscale() %>%
  as.data.frame() -> mtcars_specs_cmds

# names must be consistent with `cmdscale_ord()` below
names(mtcars_specs_cmds) <- c("PCo1", "PCo2")

# regress performance measures on principal coordinates
lm(mtcars_perf ~ as.matrix(mtcars_specs_cmds)) %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  print() -> mtcars_cmds_lm

# multidimensional scaling using `cmdscale_ord()`
mtcars %>%
  subset(select = c(cyl, disp, hp, drat, wt, vs, carb)) %>%
  scale() %>%
  cosine_dist() %>%
  cmdscale_ord() %>%
  as_tbl_ord() %>%
  augment_ord() %>%
  print() -> mtcars_specs_cmds_ord

# regression biplot
ggbiplot(mtcars_specs_cmds_ord, aes(label = name),
         sec.axes = "rows", scale.factor = 3) +
  theme_minimal() +
  geom_rows_text(size = 3) +
  geom_cols_vector(data = mtcars_cmds_lm) +
  geom_cols_text_radiate(data = mtcars_cmds_lm) +
  expand_limits(x = c(-2.25, 1.25), y = c(-2, 1.5))

# PCA of iris data
iris_pca <- ordinate(iris, cols = 1:4, prcomp, scale = TRUE)

# row-principal predictive biplot
iris_pca %>%
  augment_ord() %>%
  ggbiplot(axis.type = "predictive") +
  theme_bw() +
  scale_color_brewer(type = "qual", palette = 2) +
  geom_cols_axis(aes(label = name, center = center, scale = scale)) +
  geom_rows_point(aes(color = Species), alpha = .5) +
  ggtitle("Predictive biplot of Anderson iris measurements")