calculateIdf: Creates a table with baseline IDF stats

View source: R/tidyTFIDF.R

calculateIdf    R Documentation

Creates a table with baseline IDF stats

Description

This is a summarise operation. Given a dataframe whose grouping defines the concept, it calculates IDF stats for each concept across a set of samples.

Usage

calculateIdf(groupedDf, sampleVars, countVar = NULL, totalSamples = NA)

Arguments

groupedDf

a dataframe whose grouping defines the "term" for which we calculate the IDF

sampleVars

the column(s) that contain the unique id of a sample, i.e. traditionally a "document", but it could be a patient. Escaped by vars(...)

countVar

(optional) a field that contains a count. If this is given, it is assumed that each concept and document combination is unique

totalSamples

if the data is incomplete (not every document has a concept in it), then the expected number of samples can be specified here

Value

a data frame with idf stats for each concept in each group (i.e. document)
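
To make the idea of "baseline IDF stats" concrete, here is a minimal base R sketch of a textbook IDF calculation on toy data. It does not call calculateIdf and is not the package's implementation: the data, the column names (term, doc_id, doc_freq, idf) and the unsmoothed formula log(N / df) are all assumptions chosen for illustration, and the function's actual output columns may differ.

```r
# Toy data: one row per (term, document) occurrence.
# All column names and values here are hypothetical.
occurrences <- data.frame(
  term   = c("fever", "fever", "cough", "fever", "rash"),
  doc_id = c("d1",    "d2",    "d1",    "d3",    "d3"),
  stringsAsFactors = FALSE
)

# Total number of samples (documents). This plays the role that the
# totalSamples argument appears to play. Deriving it from the data is
# only valid if every document contains at least one term.
total_samples <- length(unique(occurrences$doc_id))

# Document frequency: number of distinct documents containing each term.
distinct_pairs <- unique(occurrences[, c("term", "doc_id")])
doc_freq <- table(distinct_pairs$term)

# Textbook IDF, log(N / df). Assumption: the package may use a
# smoothed or otherwise different variant.
idf_table <- data.frame(
  term     = names(doc_freq),
  doc_freq = as.integer(doc_freq),
  idf      = log(total_samples / as.integer(doc_freq)),
  stringsAsFactors = FALSE
)

print(idf_table)
```

In this sketch a term that occurs in every document ("fever") gets an IDF of zero, while rarer terms get a positive value. If some documents contained no terms at all, total_samples would be undercounted, which is the situation the totalSamples argument is described as addressing. With the function documented here, the analogous call would presumably group the dataframe by term and pass vars(doc_id) as sampleVars.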


terminological/tidy-info-stats documentation built on Nov. 19, 2022, 11:23 p.m.