mutInfo: Calculate Mutual Information


mutInfo    R Documentation

Calculate Mutual Information

Description

Calculate the mutual information between the training classes (codings) and each feature in the training document-feature matrix.

Usage

mutInfo(coding, train_matrix)

Arguments

coding

A numeric vector of class codings, one per training document.

train_matrix

A quanteda document-feature matrix whose number of rows equals the length of coding.

Value

A numeric vector of the same length as features(train_matrix), with one mutual information value per feature.
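
To make the returned quantity concrete, here is a minimal base R sketch of one common way to compute mutual information between class labels and feature presence. It is an illustration under stated assumptions, not the package's implementation: the helper name mutual_info_sketch is hypothetical, it takes a plain count matrix rather than a quanteda dfm, it reduces counts to presence/absence, and it reports values in bits. mutInfo itself may differ on any of these points.

```r
# Hypothetical helper, NOT part of agendacodeR: mutual information (in bits)
# between class labels and the presence/absence of each feature.
mutual_info_sketch <- function(coding, counts) {
  stopifnot(length(coding) == nrow(counts))
  present <- counts > 0  # assumption: binarize counts to presence/absence
  apply(present, 2, function(f) {
    # Joint distribution of class and feature presence
    joint <- table(coding, factor(f, levels = c(FALSE, TRUE))) / length(f)
    # Product of the marginals, i.e. the joint under independence
    indep <- outer(rowSums(joint), colSums(joint))
    nz <- joint > 0  # treat 0 * log(0) as 0
    sum(joint[nz] * log2(joint[nz] / indep[nz]))
  })
}

# Toy data: 4 documents, 2 classes, 3 features (matrix is filled column-wise)
coding <- c(1, 1, 2, 2)
counts <- matrix(c(2, 1, 0, 0,   # "tax": occurs only in class 1
                   1, 0, 1, 0,   # "vote": unrelated to class
                   1, 1, 1, 1),  # "the": occurs in every document
                 nrow = 4,
                 dimnames = list(NULL, c("tax", "vote", "the")))

mi <- mutual_info_sketch(coding, counts)
mi  # tax = 1, vote = 0, the = 0
```

In this toy data, "tax" perfectly separates the two classes and scores 1 bit, while "vote" and "the" carry no information about the class and score 0.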

Author(s)

Matt W. Loftis

Examples

## Load data and create document-feature matrices
train_corpus <- quanteda::corpus(x = training_agendas$text)
train_matrix <- quanteda::dfm(train_corpus,
                              language = "danish",
                              stem = TRUE,
                              removeNumbers = FALSE)

## Mutual information algorithm for feature selection
mut.info <- mutInfo(training_agendas$coding, train_matrix)
cutoff <- quantile(mut.info, .8)                  # Set cutoff quantile for mutual information
train_matrix <- train_matrix[, mut.info > cutoff] # Pare down training set
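
The cutoff step in the example above can be tried on its own with a small named vector. This is only a sketch: the scores below are made up for illustration and do not come from mutInfo or from training_agendas.

```r
# Made-up mutual information scores for five hypothetical features
scores <- c(a = 0.01, b = 0.20, c = 0.05, d = 0.90, e = 0.40)

# 80th percentile, using quantile()'s default interpolation (type 7)
cutoff <- unname(quantile(scores, 0.8))

# Strict comparison: only features scoring above the cutoff are kept
keep <- scores > cutoff
names(scores)[keep]  # "d"
```

Because the comparison is strict, features whose score equals the cutoff are dropped, so tied scores can leave fewer than 20% of the features in the pared-down matrix.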


mattwloftis/agendacodeR documentation built on June 5, 2023, 7 p.m.