do_pca: Do PCA on Embeddings


View source: R/visualize_embeddings.R

Description

Compute a two-dimensional PCA projection of a collection of embedding vectors, in preparation for plotting.

Usage

do_pca(
  embedding_df,
  project_vectors = embedding_df,
  disambiguate_tokens = TRUE
)

Arguments

embedding_df

A tbl_df of embedding vectors, taken from the output of extract_features.

project_vectors

A tbl_df of embedding vectors used to calculate the PCA projection matrix. Defaults to embedding_df. Supplying a fixed set here keeps the PCA "perspective" consistent even as the set of vectors being projected changes.

disambiguate_tokens

Logical; whether to append the example index and token index to the literal token for display purposes.
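
The idea behind project_vectors, fitting the projection on one set of vectors and applying it to another, can be sketched with base R. This is only an illustration of the general technique using stats::prcomp; it is not taken from the RBERTviz source, and do_pca may be implemented differently. The matrices reference_vectors and new_vectors are hypothetical stand-ins for embedding data.

```r
# Hypothetical stand-ins for embeddings (rows = tokens, columns = dimensions).
set.seed(42)
reference_vectors <- matrix(rnorm(50 * 8), nrow = 50, ncol = 8)
new_vectors <- matrix(rnorm(10 * 8), nrow = 10, ncol = 8)

# Fit the PCA on the reference set only, keeping the first two principal axes.
pca_fit <- stats::prcomp(reference_vectors, center = TRUE, scale. = FALSE, rank. = 2)

# Project the new vectors using the centering and rotation learned above,
# so both sets share the same two-dimensional "perspective".
projected <- predict(pca_fit, newdata = new_vectors)

# The same projection written out by hand: center, then rotate.
manual <- scale(new_vectors, center = pca_fit$center, scale = FALSE) %*% pca_fit$rotation
```

Because the rotation comes from reference_vectors alone, adding or removing rows in new_vectors does not change where the remaining rows land in the plot.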

Value

A tbl_df of the embedding vectors projected onto two principal axes.

Examples

## Not run: 
# assuming something like the following has been run:
# feats <- RBERT::extract_features(...) # See RBERT documentation
# Then:
pca_df <- feats$output %>%
    filter_layer_embeddings(layer_indices = 12L) %>%
    keep_tokens("[CLS]") %>%
    do_pca()

## End(Not run)

jonathanbratt/RBERTviz documentation built on June 19, 2021, 6:27 p.m.