View source: R/topic_modeling_utilities.R
Create a data frame summarizing the contents of each topic in a model
SummarizeTopics(model)
model
A list (or S3 object) with three named matrices: phi, theta, and gamma. These conform to the outputs of many of textmineR's native topic modeling functions, such as FitLdaModel.
'prevalence' is normalized to sum to 100. If your 'theta' matrix has negative values (as may be the case with an LSA model), a constant is added so that the least prevalent topic has a prevalence of 0.
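The prevalence calculation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the `theta` values are made up, and shifting the per-topic column sums by their minimum is an assumption about how the constant is applied.

```r
# Hypothetical theta: 3 documents (rows) x 3 topics (columns), with negative
# entries such as an LSA-style model might produce.
theta <- matrix(
  c(0.5, -0.2,  0.1,
    0.3,  0.4, -0.1,
    0.2, -0.1,  0.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("d_", 1:3), paste0("t_", 1:3))
)

# Raw weight of each topic across the corpus.
raw <- colSums(theta)

# Assumption: when theta has negative values, shift so that the least
# prevalent topic ends up with a prevalence of 0.
if (any(theta < 0)) {
  raw <- raw - min(raw)
}

# Normalize so the prevalences sum to 100.
prevalence <- raw / sum(raw) * 100
round(prevalence, 2)
```

For a probabilistic model such as LDA, theta has no negative entries, so the shift is skipped and only the normalization applies.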
'coherence' is calculated using CalcProbCoherence.
'label' is assigned using the top label from LabelTopics. This requires an "assignment" matrix, which is like a "theta" matrix except that it is binary: a topic is either "in" a document or it is not. The assignment is made by comparing each value of theta to a single threshold: the minimum, across documents, of each row's largest theta value. This ensures that each document has at least one topic assigned to it.
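The thresholding rule for the assignment matrix can be sketched in base R. This is an illustrative reading of the description, not the package's code: the `theta` values are hypothetical, and using `>=` for the comparison is an assumption.

```r
# Hypothetical theta: 3 documents (rows) x 3 topics (columns).
theta <- matrix(
  c(0.70, 0.20, 0.10,
    0.40, 0.35, 0.25,
    0.10, 0.10, 0.80),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("d_", 1:3), paste0("t_", 1:3))
)

# Largest theta value in each document, then the smallest of those maxima.
row_max   <- apply(theta, 1, max)
threshold <- min(row_max)

# Assumption: a topic is "in" a document when theta meets the threshold.
assignment <- (theta >= threshold) * 1
assignment
```

Because every document's own maximum is at least as large as the threshold, every row of `assignment` contains at least one 1.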
An object of class data.frame or tibble with 6 columns: 'topic' is the name of the topic, 'label_1' is the top label for the topic from LabelTopics, 'prevalence' is the rough prevalence of the topic in all documents across the corpus, 'coherence' is the probabilistic coherence of the topic, 'top_terms_phi' are the top 5 terms for each topic according to P(word|topic), and 'top_terms_gamma' are the top 5 terms for each topic according to P(topic|word).
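How the top-terms columns are formed can be illustrated with a small base R sketch. The helper `top_terms` and the toy `phi` matrix are hypothetical and only show the idea of ranking terms within each topic; the same ranking would apply to a gamma matrix laid out as topics by terms.

```r
# Hypothetical phi: 2 topics (rows) x 4 terms (columns), rows are P(word|topic).
phi <- matrix(
  c(0.50, 0.30, 0.15, 0.05,
    0.05, 0.10, 0.25, 0.60),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("t_1", "t_2"), c("cells", "cell", "dna", "rna"))
)

# Hypothetical helper: for each topic, sort terms by descending weight and
# paste the first n term names into a single comma-separated string.
top_terms <- function(m, n = 5) {
  apply(m, 1, function(p) {
    ranked <- names(sort(p, decreasing = TRUE))
    paste(ranked[seq_len(min(n, length(ranked)))], collapse = ", ")
  })
}

top <- top_terms(phi, n = 2)
top
```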
## Not run:
SummarizeTopics(nih_sample_topic_model)
## End(Not run)
Loading required package: Matrix
Attaching package: 'textmineR'
The following object is masked from 'package:Matrix':
update
The following object is masked from 'package:stats':
update
dtm does not appear to contain ngrams. Using unigrams but ngrams will work much better.
topic label_1 prevalence coherence
t_1 t_1 health 2.81 0.054
t_2 t_2 cells 3.34 0.413
t_3 t_3 diabetes 2.72 0.167
t_4 t_4 cmybp 2.78 0.198
t_5 t_5 phd 3.30 0.154
t_6 t_6 infection 2.37 0.264
t_7 t_7 risk 3.60 0.247
t_8 t_8 mitochondrial 3.09 0.262
t_9 t_9 ma 3.29 0.165
t_10 t_10 research 4.53 0.091
t_11 t_11 cell 3.68 0.059
t_12 t_12 tumor 3.80 0.216
t_13 t_13 dna 4.20 0.176
t_14 t_14 imaging 3.75 0.112
t_15 t_15 cells 3.67 0.357
t_16 t_16 influenza 3.30 0.201
t_17 t_17 intervention 3.12 0.243
t_18 t_18 mast 2.05 0.486
t_19 t_19 treatment 3.65 0.153
t_20 t_20 sleep 3.27 0.377
t_21 t_21 microbiome 2.17 0.388
t_22 t_22 dr 3.73 0.032
t_23 t_23 research 3.20 0.044
t_24 t_24 ipf 3.08 0.240
t_25 t_25 rna 4.43 0.054
t_26 t_26 core 3.93 0.122
t_27 t_27 research 4.05 0.168
t_28 t_28 inflammation 3.01 0.085
t_29 t_29 difficile 3.78 0.049
t_30 t_30 develop 2.30 0.333
top_terms_phi
t_1 health, data, women, studies, swan
t_2 ptc, brain, metastatic, brafv, cells
t_3 diabetes, influenza, numeracy, vaccine, centralized
t_4 injury, cmybp, cdk, function, fragment
t_5 phd, hif, epithelial, model, project
t_6 muscle, sand, fly, infection, strength
t_7 risk, factors, sud, early, study
t_8 mitochondrial, metabolic, redox, tissue, radiation
t_9 ma, activity, aim, cortex, mice
t_10 research, program, cancer, students, prevention
t_11 cells, cell, specific, lung, brain
t_12 cancer, dcis, pancreatic, tumor, genetic
t_13 dna, rna, transcription, repair, structure
t_14 imaging, clinical, cancer, develop, time
t_15 cells, carbon, metabolism, intracellular, cell
t_16 response, hiv, env, antibodies, human
t_17 intervention, fertility, health, behavior, community
t_18 mast, cell, cells, fc, ri
t_19 treatment, methods, evaluation, clinical, develop
t_20 sleep, plasticity, synaptic, deficits, memory
t_21 microbiome, gut, crc, psoriasis, composition
t_22 dr, administrative, ucdc, research, te
t_23 health, research, hiv, disease, testing
t_24 ipf, lung, cns, expression, based
t_25 structural, activity, natural, including, nmdar
t_26 core, center, projects, data, research
t_27 research, core, center, investigators, support
t_28 inflammation, hiv, study, battery, capacity
t_29 genetic, difficile, extinction, pd, approach
t_30 wall, large, stiffening, effects, disease
top_terms_gamma
t_1 lepi, worker, mepi, biomechanical, mt
t_2 reprograms, vegf, sorafenib, chemotherapy, micrometastatic
t_3 immunologic, chlamydial, immunized, alaska, curricula
t_4 cleavage, cardiomyocytes, stabilizes, occlusion, hyperactivation
t_5 xenotransplantation, iv, heparin, pig, hs
t_6 parasitic, sarcopenia, vector, west, elderly
t_7 kendler, heavy, trajectories, nesarc, neurocognition
t_8 couples, ratios, rt, adipocyte, reflected
t_9 prefrontal, concerns, madr, impairments, arch
t_10 accepted, undergraduate, journals, actively, sponsored
t_11 allergen, reversible, ccr, asthma, multiphtoton
t_12 taste, glycomic, origin, glycoform, shows
t_13 genomes, tefb, elongation, pairing, ac
t_14 false, nanoparticles, partial, nanosensors, emission
t_15 virulence, shigella, adherence, plaques, cytoplasm
t_16 mabs, vlbw, birth, enteric, ab
t_17 births, hospitalized, youth, adjunctive, military
t_18 truncation, interpreted, tubulin, resulted, attenuated
t_19 rules, constructing, challenging, surveillance, accuracy
t_20 eeg, impairment, psychiatric, spindle, eszopiclone
t_21 psoriatic, baseline, lifestyle, fecal, nas
t_22 fo, teprorm, stnar, sc, vnrrps
t_23 abm, hsieh, grocery, hopkins, hub
t_24 encode, overarching, mrna, srt, mirnas
t_25 substituent, plms, nrs, antifungal, lactone
t_26 nsls, bnl, computing, instruments, ray
t_27 lipidomics, invertebrate, mdibl, ctsa, cobre
t_28 recharge, gadgets, myocyte, practically, mybp
t_29 seeking, vorinostat, ido, allogeneic, exciting
t_30 cyclic, temporally, perivascular, doxycycline, tone