fit_embeds_to_pairs | R Documentation |
Fit an embedding matrix to a data frame of known pairs of related concepts. Depending on the matrix dimension (see max_concepts), compute either all pair-wise similarities or only those for the known pairs.
fit_embeds_to_pairs(
  m_embeds,
  df_pairs,
  df_pairs_cols = 1:2,
  similarity = c("inprod", "cosine", "cov_simi", "norm_inprod"),
  threshold_projs = 0.9,
  max_concepts = 1000
)
m_embeds | Embedding matrix; its row names must match the concepts in df_pairs |
df_pairs | Data frame of known relationships |
df_pairs_cols | Columns of df_pairs holding the identifiers that map to the row names of m_embeds |
similarity | Similarity measure to compute. One of 'inprod' (inner product), 'cosine', 'cov_simi' (covariance similarity), 'norm_inprod' (normalized inner product). |
threshold_projs | Specificity threshold to use for projections. The default 0.9 is equivalent to 10 percent false positives, and 0.95 to 5 percent false positives. |
max_concepts | Maximum number of concepts for which all pair-wise similarities are computed |
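The expected input shapes and two of the similarity measures can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the concept names and values are made up, only 'inprod' and 'cosine' are shown, and 'cov_simi' and 'norm_inprod' are left out because their exact definitions are not given here.

```r
# Toy embedding matrix: 4 concepts (rows) in 3 dimensions.
# Concept names C1..C4 are hypothetical.
m_embeds <- matrix(
  c(1, 0, 0,
    0, 1, 0,
    1, 1, 0,
    0, 0, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("C1", "C2", "C3", "C4"), NULL)
)

# Known pairs: the first two columns hold the concept identifiers,
# which must match the row names of m_embeds.
df_pairs <- data.frame(
  concept1 = c("C1", "C2"),
  concept2 = c("C3", "C3"),
  stringsAsFactors = FALSE
)

# All pair-wise inner products (feasible when there are few concepts).
m_inprod <- tcrossprod(m_embeds)

# Cosine similarity: inner products divided by the product of row norms.
row_norms <- sqrt(rowSums(m_embeds^2))
m_cosine <- m_inprod / tcrossprod(row_norms)

# Similarities restricted to the known pairs, looked up by row name.
idx <- cbind(df_pairs[[1]], df_pairs[[2]])
pair_inprod <- m_inprod[idx]
pair_cosine <- m_cosine[idx]
```

With many concepts the full matrix grows quadratically, which is presumably why max_concepts caps the case where every pair is computed.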
List with slots roc (return value of pROC::roc), sims and truth (to recompute partial AUCs using pROC), threshold_5fp (threshold at 5 percent false positives), n_concepts (number of concepts in the embeddings), and df_projs (data frame listing the pair-wise concept similarities above threshold_projs).
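The link between a specificity threshold and a false positive rate can be illustrated with a small base R sketch. It assumes the threshold is taken as a quantile of the similarities of unrelated pairs, which is a common approach but not necessarily how the function derives threshold_5fp or df_projs. The sims and truth vectors below are made up and only mimic the slots of the same name.

```r
# Made-up similarities and labels (1 = known pair, 0 = unrelated pair).
sims  <- c(0.90, 0.80, 0.60, 0.30,
           0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.70)
truth <- c(rep(1, 4), rep(0, 10))

# Assumption: the threshold at a given specificity is the matching
# quantile of the similarities among unrelated pairs.
spec <- 0.9
threshold <- quantile(sims[truth == 0], probs = spec, names = FALSE)

# Empirical false positive rate and sensitivity at that threshold.
fpr  <- mean(sims[truth == 0] > threshold)
sens <- mean(sims[truth == 1] > threshold)

# Pairs kept above the threshold, in the spirit of df_projs.
kept <- which(sims > threshold)
```

If pROC is installed, `pROC::roc(truth, sims)` followed by `pROC::auc()` with its `partial.auc` argument could be used on the same two vectors to recompute a partial AUC over a chosen specificity range.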