semnet | R Documentation |
This function calculates the co-occurrence of features and returns a network/graph in the igraph format, where nodes are tokens and edges represent the similarity/adjacency of tokens. Co-occurrence is calculated based on how often two tokens occurred within the same document (e.g., news article, chapter, paragraph, sentence). The semnet_window() function can be used to calculate co-occurrence of tokens within a given token distance.
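To make the idea of document-level co-occurrence concrete, here is a minimal base R sketch. It is not the implementation used by semnet(); the objects docs, vocab, dtm and cooc are hypothetical names introduced only for this illustration. It counts, for every pair of tokens, the number of documents in which both appear.

```r
# Illustration only: toy documents, already split into tokens
docs = list(
  a = c("A", "B", "C"),
  b = c("D", "E", "F", "G", "H", "I"),
  c = c("A", "D"),
  d = c("GGG")
)

# Binary document-term matrix: 1 if the token occurs in the document
vocab = sort(unique(unlist(docs)))
dtm = t(sapply(docs, function(tokens) as.integer(vocab %in% tokens)))
colnames(dtm) = vocab

# t(dtm) %*% dtm counts the documents shared by each pair of tokens
cooc = crossprod(dtm)
diag(cooc) = 0   # drop self co-occurrence

cooc["A", "D"]   # both tokens occur in document 'c'
cooc["A", "E"]   # these tokens never share a document
```

Each nonzero cell of cooc corresponds to an edge in a co-occurrence network; the similarity measures offered by semnet() can be thought of as different ways of normalizing such counts.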
semnet(
tc,
feature = "token",
measure = c("con_prob", "con_prob_weighted", "cosine", "count_directed",
"count_undirected", "chi2"),
context_level = c("document", "sentence"),
backbone = F,
n.batches = NA
)
tc |
a tCorpus or a featureHits object (i.e. the result of search_features) |
feature |
The name of the feature column |
measure |
The similarity measure. Currently supports: "con_prob" (conditional probability), "con_prob_weighted", "cosine" similarity, "count_directed" (i.e., number of co-occurrences), "count_undirected" (same as count_directed, but returned as an undirected network) and "chi2" (chi-square score) |
context_level |
Determines whether features need to co-occur within the same "document" or the same "sentence" |
backbone |
If TRUE, add an edge attribute for the backbone alpha |
n.batches |
If a number, perform the calculation in batches |
An igraph graph in which nodes are features and edges are similarity scores
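As a rough guide to how the measures differ, the following base R sketch computes textbook versions of the conditional probability and cosine similarity from a toy binary document-term matrix. This is an assumption-laden illustration: the exact formulas, weighting and edge direction used by semnet() may differ, and dtm, cooc, doc_freq, con_prob and cosine are hypothetical names.

```r
# Illustration only: rows are documents, columns are tokens
dtm = rbind(
  a = c(A = 1, B = 1, C = 1, D = 0),
  b = c(A = 0, B = 0, C = 0, D = 1),
  c = c(A = 1, B = 0, C = 0, D = 1)
)

cooc = crossprod(dtm)    # pairwise co-occurrence counts
doc_freq = diag(cooc)    # diagonal = number of documents containing each token

# Conditional probability P(column token | row token):
# each row i is divided by the document frequency of token i
con_prob = cooc / doc_freq

# Cosine similarity between the binary occurrence vectors of two tokens
cosine = cooc / sqrt(outer(doc_freq, doc_freq))

con_prob["A", "B"]   # B occurs in 1 of the 2 documents that contain A
con_prob["B", "A"]   # A occurs in the only document that contains B
cosine["A", "B"]     # symmetric, so the same as cosine["B", "A"]
```

The conditional probability is asymmetric, which is why a network based on it is directed, whereas the cosine yields a single undirected weight per token pair.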
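library(corpustools)

text = c('A B C', 'D E F. G H I', 'A D', 'GGG')
tc = create_tcorpus(text, doc_id = c('a','b','c','d'), split_sentences = TRUE)

## co-occurrence network of tokens, using the default settings
g = semnet(tc, 'token')
g

## inspect the edges as a data.frame, then plot the network
igraph::as_data_frame(g)
plot_semnet(g)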